Posted to user@ctakes.apache.org by "Cox, Kevin (US - Raleigh)" <ke...@deloitte.com> on 2018/10/12 17:24:23 UTC

UMLS dictionary lookup issues

Hi:

I've installed cTAKES 4.0.0 along with the additional resources package.  My UMLS account has also been set up for use with the cTAKES installation.

However, when I use CVD to run the AggregatePlaintextFastUMLSProcessor engine, it's not able to identify the most specific UMLS concept mapping.

For example, when I search for "influenza vaccine", only the term "vaccine" is mapped to the UMLS medication concept with ID C0042210 (Vaccines).

However, a similar search for "flu vaccine" finds the more specific UMLS medication concept with ID C0021403 (Influenza virus vaccine).

Looking at the UMLS Metathesaurus, I see all the synonyms needed to make this concept lookup possible, so I'm wondering why cTAKES cannot map to the more specific UMLS concepts.  I've looked at the cTAKES online demos, and they have the same issue, so I don't think this is a problem with my local setup.

Wondering if others have found a solution to this...  I appreciate any guidance you can provide.

Thanks,
Kevin Cox
Product Architect | Deloitte Platforms
Deloitte Consulting
Raleigh, NC
Tel/Direct: (919) 546-8079 | Fax: (855) 534-6906 | Mobile: (919) 604-5404
kevicox@deloitte.com | www.deloitte.com


This message (including any attachments) contains confidential information intended for a specific individual and purpose, and is protected by law. If you are not the intended recipient, you should delete this message and any disclosure, copying, or distribution of this message, or the taking of any action based on it, by you is strictly prohibited.

v.E.1

RE: UMLS dictionary lookup issues

Posted by "Cox, Kevin (US - Raleigh)" <ke...@deloitte.com>.
Right – clearly “flu” is a synonym for “influenza” in the SNOMED dictionary.

The SNOMED base term itself is “Influenza virus vaccine”, and even that term does not produce a match in cTAKES.

Is it possible that the SNOMED dictionary shipped with cTAKES just doesn’t include all the synonyms from the UMLS?
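
To make that hypothesis concrete, here is a toy sketch of a longest-match dictionary lookup. It is not the cTAKES implementation; the dictionary contents, the `lookup` function and the whitespace tokenization are all assumptions chosen only to mirror the behavior described in this thread. If the two-word synonym is missing from the dictionary, a lookup of this kind can only fall back to the shorter term:

```python
# Toy longest-match lookup; NOT how cTAKES is implemented.
# Hypothetical dictionary: term tokens -> CUI. The entry for
# ("influenza", "vaccine") is deliberately absent.
TOY_DICTIONARY = {
    ("vaccine",): "C0042210",        # Vaccines
    ("flu", "vaccine"): "C0021403",  # Influenza virus vaccine
}


def lookup(text, dictionary):
    """Return (matched text, CUI) pairs, preferring the longest span at each position."""
    tokens = text.lower().split()
    max_len = max(len(term) for term in dictionary)
    matches = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate span first, then shorter ones.
        for size in range(min(max_len, len(tokens) - i), 0, -1):
            span = tuple(tokens[i:i + size])
            if span in dictionary:
                matches.append((" ".join(span), dictionary[span]))
                i += size
                break
        else:
            # No dictionary term starts at this token; move on.
            i += 1
    return matches


print(lookup("flu vaccine", TOY_DICTIONARY))
print(lookup("influenza vaccine", TOY_DICTIONARY))
```

In this sketch, adding `("influenza", "vaccine")` with CUI C0021403 to the dictionary is enough for the second lookup to return the more specific concept, which is why checking the shipped dictionary for that synonym seems like a reasonable first step.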

Thanks,
Kevin Cox
Tel/Direct: (919) 546-8079 | Mobile: (919) 604-5404
kevicox@deloitte.com

Re: UMLS dictionary lookup issues

Posted by "Schenk, Gundolf" <Gu...@ucsf.edu>.
Hi,

I am seeing the same behavior. But I am not sure if it is really a dictionary issue.

Look here: http://browser.ihtsdotools.org/?perspective=full&conceptId1=404684003&edition=en-edition&release=v20180131&server=http://browser.ihtsdotools.org/api/v1/snomed&langRefset=900000000000509007
When I search for “influenza vaccine” it finds 168 matches, including the correct concepts: “substance”, “product”, “situation” and “disorder” in the top 10.
Interestingly, a search for “flu vaccine” finds only 3 matches: “situation”, “product” and “substance”.

I wonder what exact “chunk” is used for the look-up. Any guidance on where to look in the CAS (or its display in the CVD) for the query string?

Cheers,
Gundolf.



Re: UMLS dictionary lookup issues

Posted by Ory Henn <or...@trialjectory.com>.
Kevin,
This does indeed look like a dictionary issue. You can take a look at the
thread I started a couple of months ago (titled "How do I add a
dictionary (like NCI) to cTakes lookup?"), which has more links and
resources, as well as workarounds for some issues you may encounter.
Good luck
Ory

Re: UMLS dictionary lookup issues

Posted by nishant kumar <ku...@gmail.com>.
Hi Kevin,

It seems you are facing an issue related to the dictionary your cTAKES
installation is mapped to.  By default, cTAKES is mapped to the RxNorm and
SNOMED CT dictionaries. If you wish to capture concepts that are more
specific to another ontology, you need to build the corresponding
dictionary and point your cTAKES installation to it. This can be done
with the Dictionary Creator tool. Here is some info on how to build a
dictionary:
https://cwiki.apache.org/confluence/display/CTAKES/Dictionary+Creator+GUI
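
Before building anything, it may help to collect the synonyms you expect to match in one reviewable list. The sketch below is only a hypothetical helper: the `to_bsv_lines` name, the bar-separated `CUI|term` layout and the lower-casing are assumptions made for illustration, not a documented cTAKES format, so check the page above for what the Dictionary Creator actually expects.

```python
def to_bsv_lines(entries):
    """Format (cui, term) pairs as 'CUI|term' lines.

    Terms are lower-cased and whitespace-normalized (an assumption for
    this sketch); empty terms and exact duplicates are skipped.
    """
    seen = set()
    lines = []
    for cui, term in entries:
        normalized = " ".join(term.lower().split())
        key = (cui, normalized)
        if not normalized or key in seen:
            continue
        seen.add(key)
        lines.append(f"{cui}|{normalized}")
    return lines


# Synonyms taken from this thread; the list itself is illustrative.
wanted = [
    ("C0021403", "Influenza virus vaccine"),
    ("C0021403", "influenza vaccine"),
    ("C0021403", "Influenza  Vaccine"),  # duplicate after normalization
    ("C0021403", "flu vaccine"),
]

for line in to_bsv_lines(wanted):
    print(line)
```

A list like this also gives you a simple checklist for testing the rebuilt dictionary, one term at a time, in the CVD.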

Hope it helps.

thanks
Nishant

