
Posted to user@ctakes.apache.org by Bandeep Singh <bs...@phemi.com> on 2016/08/29 17:21:03 UTC

cTakes not parsing every text with SNOMED-CT

Hi,

I have been trying to parse a sample text using cTakes. However, cTakes
does not annotate a few words with their UMLS/SNOMED CT equivalent codes.

Sample text: " The patient is suffering from extreme pain due to shark
bite"

*cTakes output: *
[image: Inline image 1]

As the example above shows, the word "suffering" is not annotated with its
SNOMED CT equivalent code *706873003.*

Can somebody explain why? Is there any other customized SNOMED/UMLS
dictionary I can use to annotate everything?

Re: cTakes not parsing every text with SNOMED-CT

Posted by Bandeep Singh <bs...@phemi.com>.

Also,

It seems I am using an old UMLS dictionary (UMLS2011AB).
Can somebody share a link to the latest UMLS dictionaries that work with
cTakes, along with the steps to integrate them into cTakes?

Thanks,
Bandeep


Re: cTakes not parsing every text with SNOMED-CT

Posted by Bandeep Singh <bs...@phemi.com>.

Thanks for the reply, Savova.
I tried changing the lookup window in *LookupDesc.xml* (under
*resources/org/apache/ctakes/dictionary/lookup*) but could not get it to
work.
Is there any other config file I need to change for the updates to take
effect?

Thanks,
Bandeep


RE: cTakes not parsing every text with SNOMED-CT

Posted by "Savova, Guergana" <Gu...@childrens.harvard.edu>.

You’d probably need to define the lookup window to include the verb part-of-speech tags (they all start with V). The default lookup window anchors around the N* phrases (noun phrases, which are sequences of part-of-speech tags). You have to modify the lookup setting.
Hope this helps.
--Guergana

Guergana Savova, PhD, FACMI
Associate Professor
PI Natural Language Processing Lab
Boston Children's Hospital and Harvard Medical School
300 Longwood Avenue
Mailstop: BCH3092
Enders 144.1
Boston, MA 02115
Harvard Scholar: http://scholar.harvard.edu/guergana_k_savova/biocv
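
The point about part-of-speech tags in the reply above can be illustrated with a small sketch. This is not cTakes code and does not use any cTakes API: the function name, the toy dictionary, and the hand-assigned Penn Treebank style tags are all assumptions made for illustration. It only shows why a lookup restricted to noun tags would skip "suffering", and why widening it to verb tags would pick the word up.

```python
# Illustrative only: this is NOT cTakes code. It mimics the idea of a
# dictionary lookup that is restricted by part-of-speech tags.

# Hand-assigned Penn Treebank style tags for the sample sentence
# (an assumption; a real tagger may assign different tags).
TAGGED = [
    ("The", "DT"), ("patient", "NN"), ("is", "VBZ"), ("suffering", "VBG"),
    ("from", "IN"), ("extreme", "JJ"), ("pain", "NN"), ("due", "JJ"),
    ("to", "TO"), ("shark", "NN"), ("bite", "NN"),
]

# Toy dictionary: term -> code. It holds only the code quoted in the thread.
TOY_DICTIONARY = {"suffering": "706873003"}


def lookup(tagged_tokens, allowed_prefixes, dictionary):
    """Return (token, code) pairs for tokens whose tag starts with an allowed prefix."""
    prefixes = tuple(allowed_prefixes)
    hits = []
    for token, tag in tagged_tokens:
        if not tag.startswith(prefixes):
            continue  # token is outside the (hypothetical) lookup window
        code = dictionary.get(token.lower())
        if code is not None:
            hits.append((token, code))
    return hits


# Nouns only: "suffering" is tagged VBG, so it is never looked up.
noun_only = lookup(TAGGED, ["N"], TOY_DICTIONARY)

# Nouns and verbs: "suffering" is now inside the window and matches.
nouns_and_verbs = lookup(TAGGED, ["N", "V"], TOY_DICTIONARY)

print(noun_only)        # []
print(nouns_and_verbs)  # [('suffering', '706873003')]
```

In cTakes itself the equivalent behaviour is controlled by the lookup descriptor rather than by code like this, so the sketch should be read as an explanation of the idea, not as a configuration recipe.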
