Posted to user@uima.apache.org by Ahmed Abdeen Hamed <ah...@gmail.com> on 2008/07/06 06:40:18 UTC
ConceptMapper: Dictionary application to certain parts of a document
Hello,
I have a quick question about the ConceptMapper project. How can I apply
dictionary terms to only a certain part of a document? For example, if you have
documents that have titles and abstracts and you need to find only the terms
that appear in the abstract, not the title, how do you do that? Also, what if
you would like to apply a filter, such as detecting a certain POS (e.g., names
vs. verbs)? How would you approach this problem? Are there examples that I can
take a look at? Please let me know if you have an answer for me.
Thanks in advance!
Ahmed
Re: ConceptMapper: Dictionary application to certain parts of a document
Posted by Ahmed Abdeen Hamed <ah...@gmail.com>.
Thank you Dr. Bill. That sounds very much like what I wanted. I will let you
know if I get stuck with implementing it.
Best wishes,
Ahmed
On Mon, Jul 7, 2008 at 2:45 PM, J. William Murdock <bi...@murdocks.org>
wrote:
> Here is a solution that may be a bit inefficient, but fits well into the
> framework. Produce an aggregate with three elements:
>
> 1) An annotator that takes the text you want to analyze (e.g.,
> abstractAnnotation.getCoveredText()) and copies it into a new sofa.
>
> 2) ConceptMapper; your main aggregate should specify a sofa mapping to make
> it run on the sofa that the first annotator created.
>
> 3) An annotator that copies all of the annotations from the sofa that the
> first annotator created to the default sofa. When copying the annotations,
> it adds the begin offset of the text that was analyzed (e.g.,
> abstractAnnotation.getBegin()) to each annotation's begin and end offsets.
>
> --
> Bill Murdock, PhD
> UIMA User (but not a UIMA developer or official spokesperson)
> IBM Watson Research Center
> 19 Skyline Dr., Hawthorne, NY 10532 USA
> http://bill.murdocks.org
Re: ConceptMapper: Dictionary application to certain parts of a document
Posted by "J. William Murdock" <bi...@murdocks.org>.
Here is a solution that may be a bit inefficient, but fits well into the
framework. Produce an aggregate with three elements:
1) An annotator that takes the text you want to analyze (e.g.,
abstractAnnotation.getCoveredText()) and copies it into a new sofa.
2) ConceptMapper; your main aggregate should specify a sofa mapping to
make it run on the sofa that the first annotator created.
3) An annotator that copies all of the annotations from the sofa that
the first annotator created to the default sofa. When copying the
annotations, it adds the begin offset of the text that was analyzed
(e.g., abstractAnnotation.getBegin()) to each annotation's begin and end
offsets.
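The offset bookkeeping in the three steps above can be sketched in plain Java. This is only an illustration under stated assumptions, not UIMA code: the Span class, the method names, and the sample document are hypothetical, the "sofa" is modeled as an ordinary String, and the toy exact-match lookup merely stands in for ConceptMapper.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Plain-Java stand-in for the three-step aggregate; no UIMA classes are used. */
class RegionMappingSketch {

    /** Hypothetical minimal annotation: a labeled [begin, end) span. */
    static final class Span {
        final int begin;
        final int end;
        final String label;

        Span(int begin, int end, String label) {
            this.begin = begin;
            this.end = end;
            this.label = label;
        }
    }

    /** Step 1: copy the covered text of the region into its own "view". */
    static String copyRegion(String document, Span region) {
        return document.substring(region.begin, region.end);
    }

    /** Step 2: toy stand-in for ConceptMapper (exact, case-sensitive lookup). */
    static List<Span> matchTerms(String text, List<String> terms) {
        List<Span> hits = new ArrayList<>();
        for (String term : terms) {
            if (term.isEmpty()) {
                continue; // an empty term would match everywhere
            }
            int from = 0;
            int at;
            while ((at = text.indexOf(term, from)) >= 0) {
                hits.add(new Span(at, at + term.length(), term));
                from = at + term.length();
            }
        }
        return hits;
    }

    /** Step 3: shift region-relative offsets back to document offsets. */
    static List<Span> shiftToDocument(List<Span> hits, int regionBegin) {
        List<Span> shifted = new ArrayList<>();
        for (Span hit : hits) {
            shifted.add(new Span(hit.begin + regionBegin, hit.end + regionBegin, hit.label));
        }
        return shifted;
    }

    public static void main(String[] args) {
        // Title on the first line, abstract on the second (assumed layout).
        String document = "protein folding\nWe study protein folding in yeast.";
        Span abstractSpan = new Span(16, document.length(), "abstract");

        String view = copyRegion(document, abstractSpan);
        List<Span> local = matchTerms(view, Arrays.asList("protein folding"));
        List<Span> global = shiftToDocument(local, abstractSpan.begin);

        for (Span s : global) {
            System.out.println(s.label + " [" + s.begin + ", " + s.end + ")");
        }
    }
}
```

The title also contains the term, but it is never matched because only the abstract text is handed to the matcher. The shift in step 3 is what makes the resulting spans line up with the original document again.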
--
Bill Murdock, PhD
UIMA User (but not a UIMA developer or official spokesperson)
IBM Watson Research Center
19 Skyline Dr., Hawthorne, NY 10532 USA
http://bill.murdocks.org
Ahmed Abdeen Hamed wrote:
> Hi David,
> Thank you for your response. I actually wrote annotators that find useful
> things. Is there a way you can get access to those annotators from your
> aggregate analysis engine that gets produced by UIMAFramework? I could do a
> workaround and only pass the text that I am interested in parsing. However,
> my solution is required to be within the UIMA framework.
> Thanks again!
> Ahmed
Re: ConceptMapper: Dictionary application to certain parts of a document
Posted by Ahmed Abdeen Hamed <ah...@gmail.com>.
Hi David,
Thank you for your response. I actually wrote annotators that find useful
things. Is there a way you can get access to those annotators from your
aggregate analysis engine that gets produced by UIMAFramework? I could do a
workaround and only pass the text that I am interested in parsing. However,
my solution is required to be within the UIMA framework.
Thanks again!
Ahmed
On Mon, Jul 7, 2008 at 2:11 PM, David Buttler <bu...@llnl.gov> wrote:
> This seems very straightforward to me. My approach may not be the most
> efficient, but I would
> 1) write a wrapper around the ConceptMapper code so that you only pass it
> spans of text that you would find useful.
> 2) write a post-processing filter that throws away any tag that occurs in a
> region of the text that you think is inappropriate (e.g. if you do not want
> to tag a verb).
>
> All of this would most easily be put into a single processing component so
> you don't have unwanted annotations in your CAS.
>
> Dave
Re: ConceptMapper: Dictionary application to certain parts of a document
Posted by David Buttler <bu...@llnl.gov>.
This seems very straightforward to me. My approach may not be the most
efficient, but I would
1) write a wrapper around the ConceptMapper code so that you only pass
it spans of text that you would find useful.
2) write a post-processing filter that throws away any tag that occurs
in a region of the text that you think is inappropriate (e.g. if you do
not want to tag a verb).
All of this would most easily be put into a single processing component
so you don't have unwanted annotations in your CAS.
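The post-processing filter in step 2 could look roughly like the following plain-Java sketch. Everything here is assumed for illustration: the Span class, the method names, and the idea that some POS tagger has already produced spans for the verbs. None of it is ConceptMapper or UIMA API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Plain-Java sketch of a post-processing filter; no UIMA classes are used. */
class PostFilterSketch {

    /** Hypothetical minimal annotation: a labeled [begin, end) span. */
    static final class Span {
        final int begin;
        final int end;
        final String label;

        Span(int begin, int end, String label) {
            this.begin = begin;
            this.end = end;
            this.label = label;
        }
    }

    /** True when the two half-open spans share at least one character. */
    static boolean overlaps(Span a, Span b) {
        return a.begin < b.end && b.begin < a.end;
    }

    /** Keep only the tags that do not touch any excluded region. */
    static List<Span> dropOverlapping(List<Span> tags, List<Span> excluded) {
        List<Span> kept = new ArrayList<>();
        for (Span tag : tags) {
            boolean rejected = false;
            for (Span region : excluded) {
                if (overlaps(tag, region)) {
                    rejected = true;
                    break;
                }
            }
            if (!rejected) {
                kept.add(tag);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Sample text: "stress fractures heal. loads stress the joint."
        // Dictionary tags for "stress" at offsets 0 and 29 (assumed input).
        List<Span> tags = Arrays.asList(
                new Span(0, 6, "stress"),
                new Span(29, 35, "stress"));
        // The second occurrence is a verb according to an assumed POS tagger.
        List<Span> verbs = Arrays.asList(new Span(29, 35, "VERB"));

        for (Span s : dropOverlapping(tags, verbs)) {
            System.out.println(s.label + " [" + s.begin + ", " + s.end + ")");
        }
    }
}
```

The same routine covers the title case as well: passing the title's span as an excluded region drops any dictionary tag found there.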
Dave
Ahmed Abdeen Hamed wrote:
> Hello,
> I have a quick question about the ConceptMapper project. How can I apply
> dictionary terms to a certain part of a document? For example, if you have
> documents that have titles and abstracts and you need only to find terms
> that appear in the abstract not the title, how do you do that? Also, if you
> would like to apply a filter such as detecting a certain POS like names vs
> verbs. How would you approach this problem? Are there examples that I can
> take a look at? Please let me know if you have an answer for me.
> Thanks in advance!
> Ahmed