Posted to user@ctakes.apache.org by ravi garg <ra...@gmail.com> on 2013/04/22 21:05:28 UTC

Regarding Entity Recognition

Hey,
First of all, congratulations on building such wonderful software. I am very new
to cTAKES, so I have a very basic question.
Is it possible to identify multiple words as a single entity? For example,
right now knee pain gets identified as 'knee' and 'pain'. Is it possible to
get 'knee pain' as a single entity? If so, what changes do I have to make to
get going?


-- 
Ravi Garg
3rd Year
MSc (hons) Biological Sciences
B.E (hons) Computer Science and Engineering
BITS Pilani KK Birla Goa Campus

Re: Regarding Entity Recognition

Posted by ravi garg <ra...@gmail.com>.
Hey,
Thank you that worked. :)


On Wed, Apr 24, 2013 at 6:56 PM, Chen, Pei
<Pe...@childrens.harvard.edu> wrote:

> Hi Ravi,
> In the attached LookupDesc_csv_sample.xml, it looks like it is still
> configured to use DirectLookupInitializerImpl...
>
> <dictionaryRef idRef="DICT_CSV_SAMPLE" />
> <lookupInitializer
> className="org.apache.ctakes.dictionary.lookup.ae.DirectLookupInitializerImpl">
> </lookupInitializer>
>
> Try:
> <lookupInitializer
> className="org.apache.ctakes.dictionary.lookup.ae.FirstTokenPermLookupInitializerImpl">
> <properties>
> <property key="textMetaFields" value="0|1"/>
> <property key="maxPermutationLevel" value="7"/>
> <property key="windowAnnotations"
> value="org.apache.ctakes.typesystem.type.textspan.LookupWindowAnnotation"/>
> </properties>
> </lookupInitializer>
> Instead.
>
> > I also wanted to know if this is the only method to use a non-UMLS
> vocabulary as a dictionary in cTAKES
> It should be configurable to use CSV, Lucene, MySQL/HSQLDB, etc., where one
> can insert a custom vocabulary, and one can also implement custom
> lookup algorithms if needed.
> --Pei

RE: Regarding Entity Recognition

Posted by "Chen, Pei" <Pe...@childrens.harvard.edu>.
Hi Ravi,
In the attached LookupDesc_csv_sample.xml, it looks like it is still configured to use DirectLookupInitializerImpl:

<dictionaryRef idRef="DICT_CSV_SAMPLE" />
<lookupInitializer className="org.apache.ctakes.dictionary.lookup.ae.DirectLookupInitializerImpl">
</lookupInitializer>

Try this instead:
<lookupInitializer className="org.apache.ctakes.dictionary.lookup.ae.FirstTokenPermLookupInitializerImpl">
  <properties>
    <property key="textMetaFields" value="0|1"/>
    <property key="maxPermutationLevel" value="7"/>
    <property key="windowAnnotations" value="org.apache.ctakes.typesystem.type.textspan.LookupWindowAnnotation"/>
  </properties>
</lookupInitializer>

> I also wanted to know if this is the only method to use a non-UMLS vocabulary as a dictionary in cTAKES
It should be configurable to use CSV, Lucene, MySQL/HSQLDB, etc., where one can insert a custom vocabulary, and one can also implement custom lookup algorithms if needed.
--Pei

From: ravi garg [mailto:ravigarg27@gmail.com] 
Sent: Wednesday, April 24, 2013 8:29 AM
To: user@ctakes.apache.org
Subject: Re: Regarding Entity Recognition

Hey, sorry for the delayed reply.
I believe the changes you are suggesting are to be made in LookupDesc_csv_sample.xml. I made those changes but still didn't get the required results.
I am attaching the files here for reference.
I also wanted to know if this is the only method to use a non-UMLS vocabulary as a dictionary in cTAKES.
Regards,
Ravi Garg 


Re: Regarding Entity Recognition

Posted by ravi garg <ra...@gmail.com>.
Hey, sorry for the delayed reply.
I believe the changes you are suggesting are to be made in
LookupDesc_csv_sample.xml. I made those changes but still didn't get the
required results.

I am attaching the files here for reference.

I also wanted to know if this is the only method to use a non-UMLS
vocabulary as a dictionary in cTAKES.

Regards,
Ravi Garg


On Tue, Apr 23, 2013 at 1:40 AM, Chen, Pei
<Pe...@childrens.harvard.edu> wrote:

> Ravi,
> Could you please attach the DictionaryLookupAnnotarCSV.xml
> In particular, please consider using the
> FirstTokenPermLookupInitializerImpl vs DirectLookup...
>
> <lookupInitializer
> className="org.apache.ctakes.dictionary.lookup.ae.FirstTokenPermLookupInitializerImpl">
> <properties>
> <property key="textMetaFields" value="0|1"/>
> <property key="maxPermutationLevel" value="7"/>
> <property key="windowAnnotations"
> value="org.apache.ctakes.typesystem.type.textspan.LookupWindowAnnotation"/>
> </properties>
> </lookupInitializer>
>
> I hope that helps.

RE: Regarding Entity Recognition

Posted by "Chen, Pei" <Pe...@childrens.harvard.edu>.
Ravi,
Could you please attach the DictionaryLookupAnnotarCSV.xml?
In particular, please consider using FirstTokenPermLookupInitializerImpl rather than DirectLookup:

<lookupInitializer className="org.apache.ctakes.dictionary.lookup.ae.FirstTokenPermLookupInitializerImpl">
  <properties>
    <property key="textMetaFields" value="0|1"/>
    <property key="maxPermutationLevel" value="7"/>
    <property key="windowAnnotations" value="org.apache.ctakes.typesystem.type.textspan.LookupWindowAnnotation"/>
  </properties>
</lookupInitializer>

I hope that helps.
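
For readers following along, here is a rough Python sketch of why a first-token permutation lookup can find "knee pain" where a lookup that only checks single tokens cannot. This is not the cTAKES implementation. The function names (candidate_phrases, lookup) are hypothetical, and treating maxPermutationLevel as "how many of the following tokens may be combined with the first token" is an assumption based on the property name; the real algorithm may differ in its details.

```python
from itertools import permutations


def candidate_phrases(first_token, following_tokens, max_level):
    """Yield the first token alone, then the first token followed by every
    ordered selection of up to max_level of the tokens that come after it."""
    yield first_token
    for size in range(1, min(max_level, len(following_tokens)) + 1):
        for combo in permutations(following_tokens, size):
            yield " ".join((first_token,) + combo)


def lookup(window_tokens, dictionary_phrases, max_level=7):
    """Return every dictionary phrase that can be built from the window,
    anchored on each token in turn (toy model, assumed semantics)."""
    hits = []
    for i, token in enumerate(window_tokens):
        rest = window_tokens[i + 1:]
        for phrase in candidate_phrases(token, rest, max_level):
            if phrase in dictionary_phrases:
                hits.append(phrase)
    return hits


# With max_level=7 the two-token phrase is found.
hits = lookup(["knee", "pain"], {"knee pain"}, max_level=7)
print(hits)

# With max_level=0 only single tokens are tried, so nothing matches.
print(lookup(["knee", "pain"], {"knee pain"}, max_level=0))
```

The number of candidates grows quickly with the window size, which is presumably why the lookup is restricted to a window annotation and a maximum level.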

From: ravi garg [mailto:ravigarg27@gmail.com]
Sent: Monday, April 22, 2013 4:09 PM
To: user@ctakes.apache.org
Subject: Re: Regarding Entity Recognition

Sorry, but this doesn't solve the problem either.


Re: Regarding Entity Recognition

Posted by ravi garg <ra...@gmail.com>.
Sorry, but this doesn't solve the problem either.


On Tue, Apr 23, 2013 at 1:28 AM, Savova, Guergana <
Guergana.Savova@childrens.harvard.edu> wrote:

> Try adding in the dictionary:
> Knee|knee pain|....
>
> The first field is reserved for the first word of the phrase.
> Regards,
> --Guergana

RE: Regarding Entity Recognition

Posted by "Savova, Guergana" <Gu...@childrens.harvard.edu>.
Try adding in the dictionary:
Knee|knee pain|....

The first field is reserved for the first word of the phrase.
Regards,
--Guergana
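
To make the "first field is the first word" layout concrete, here is a minimal Python sketch of a pipe-delimited dictionary indexed by its first field. It is only an illustration, not cTAKES code. The rows, the third "code" column, the EX00x values, and the function names are all made up for the example.

```python
from collections import defaultdict

# Hypothetical rows in the form: first word | full phrase | code.
ROWS = [
    "knee|knee pain|EX001",
    "knee|knee|EX002",
    "pain|pain|EX003",
]


def build_first_word_index(rows, delimiter="|"):
    """Map each first word to the list of (phrase tokens, code) it can start."""
    index = defaultdict(list)
    for row in rows:
        first_word, phrase, code = row.split(delimiter)
        key = first_word.strip().lower()
        index[key].append((phrase.strip().lower().split(), code))
    return index


def longest_match(tokens, start, index):
    """Return the longest indexed phrase that begins at tokens[start], or None."""
    best = None
    for phrase_tokens, code in index.get(tokens[start].lower(), []):
        window = [t.lower() for t in tokens[start:start + len(phrase_tokens)]]
        if window == phrase_tokens and (best is None or len(phrase_tokens) > len(best[0])):
            best = (phrase_tokens, code)
    return best


index = build_first_word_index(ROWS)
result = longest_match("severe knee pain".split(), 1, index)
print(result)
```

In this toy model, a row written as "knee pain|knee pain|..." would be filed under the key "knee pain", which a single token such as "knee" never equals, so the phrase would not be found.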

From: ravi garg [mailto:ravigarg27@gmail.com]
Sent: Monday, April 22, 2013 3:37 PM
To: user@ctakes.apache.org
Subject: Re: Regarding Entity Recognition

Hey,
Thanks for reply.
First, let me describe the configuration I am using. I am using AggregatePlaintextProcessor.xml, with DictionaryLookupAnnotarCSV.xml as the DictionaryLookupAnnotar. It reads the dictionary from two sources: the flat file dictionary1.csv and a Lucene index. I have added knee pain as a single term in dictionary1.csv (like knee pain| knee pain), but I am still not able to get it as a single entity. Am I missing something here?
Regards,
Ravi Garg


Re: Regarding Entity Recognition

Posted by ravi garg <ra...@gmail.com>.
Hey,
Thanks for reply.
First, let me describe the configuration I am using. I am using
AggregatePlaintextProcessor.xml, with DictionaryLookupAnnotarCSV.xml as the
DictionaryLookupAnnotar. It reads the dictionary from two sources: the flat
file dictionary1.csv and a Lucene index. I have added knee pain as a single
term in dictionary1.csv (like knee pain| knee pain), but I am still not able
to get it as a single entity. Am I missing something here?

Regards,
Ravi Garg


On Tue, Apr 23, 2013 at 12:49 AM, Chen, Pei
<Pe...@childrens.harvard.edu> wrote:

> Hi Ravi,
> Yes, in your example “knee pain”, the default behavior in the dictionary
> lookup will create 3 IdentifiedAnnotations
> “knee”, “pain”, as well as “knee pain”.
>
> [Assuming the terms exist in the UMLS dictionary]
> --Pei

RE: Regarding Entity Recognition

Posted by "Chen, Pei" <Pe...@childrens.harvard.edu>.
Hi Ravi,
Yes, in your example "knee pain", the default behavior in the dictionary lookup will create 3 IdentifiedAnnotations
"knee", "pain", as well as "knee pain".

[Assuming the terms exist in the UMLS dictionary]
--Pei
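
As a small illustration of the behavior described above (overlapping annotations for "knee", "pain", and "knee pain"), here is a toy Python matcher. It is a sketch under stated assumptions, not how cTAKES does it: find_matches, max_len, and the three-entry dictionary are hypothetical stand-ins for the real lookup and the UMLS dictionary.

```python
def find_matches(tokens, dictionary, max_len=3):
    """Return (start, end, phrase) for every token span of up to max_len
    tokens whose lowercased text is present in the dictionary."""
    matches = []
    for start in range(len(tokens)):
        last_end = min(start + max_len, len(tokens))
        for end in range(start + 1, last_end + 1):
            phrase = " ".join(tokens[start:end]).lower()
            if phrase in dictionary:
                matches.append((start, end, phrase))
    return matches


# Stand-in for a dictionary that contains all three terms.
dictionary = {"knee", "pain", "knee pain"}
tokens = "Patient reports knee pain".split()
spans = find_matches(tokens, dictionary)
print(spans)
```

Each match is reported independently, so the multi-word term does not suppress the single-word terms it contains. Keeping only the longest span would need an extra filtering step.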

From: ravi garg [mailto:ravigarg27@gmail.com]
Sent: Monday, April 22, 2013 3:06 PM
To: user@ctakes.apache.org
Subject: Regarding Entity Recognition

Hey,
First of all, congratulations on building such wonderful software. I am very new to cTAKES, so I have a very basic question.
Is it possible to identify multiple words as a single entity? For example, right now knee pain gets identified as 'knee' and 'pain'. Is it possible to get 'knee pain' as a single entity? If so, what changes do I have to make to get going?

