Posted to user@ctakes.apache.org by EM Gladiators <em...@gmail.com> on 2014/09/21 23:53:21 UTC

ctakes not capturing all meds/rxnorm IDs

I have recently installed cTAKES and am using the
AggregatePlaintextUMLSProcessor.  When I run it in the CVD, it does not
annotate all medication names.  It annotates metformin and nifedipine
correctly, but it fails to annotate aspirin and simvastatin.  It is unclear
why some names are annotated and others are not, since all of these
medication names are found in RxNorm.  I am concerned that it is not using
the RxNorm files correctly.  Is there a way to check whether the full
RxNorm is being searched?

I have not had a problem with SNOMED CT terms; cTAKES does appear to be
capturing these correctly.

Thank you,

James Foster

My setup is as follows...

Mac OS 10.9.4
Java version 1.8.0_20
ctakes version 3.2.0 with optional resources package

Re: ctakes not capturing all meds/rxnorm IDs

Posted by EM Gladiators <em...@gmail.com>.
Very good, this seems to have solved my current issue.  With your suggested
change, it is now picking up all the medication names I have tried so
far.  SNOMED terms are also being captured correctly.  In my
LookupDesc_Db.xml I have 2 lookupConsumer tags.  What is the difference
between these 2 sections (see below)?  I had to play around with various
options, but using the NamedEntityLookupConsumerImpl for only the first
lookupConsumer tag works for me.  The pasted text is a snapshot of my
current LookupDesc_Db.xml.

Sample text:
This is a 55 year old man who presents to the emergency department with
chest pain.  The pain is left sided and radiates to the back.  He describes
it as a pressure.  It started 1 hour prior to arrival.  The patient was
given aspirin by EMS and sublingual nitroglycerin.  At home the patient
takes simvastatin, colace and metformin for his type II diabetes.  He says
his blood sugar has been controlled.  He denies a history of coronary
artery disease and has never had a stress test or cardiac cath.

In my current setup, it is detecting all medications mentioned in this
sample text.


<lookupConsumer className="org.apache.ctakes.dictionary.lookup.ae.NamedEntityLookupConsumerImpl">
  <properties>
    <property key="codingScheme" value="SNOMED"/>
    <property key="cuiMetaField" value="cui"/>
    <property key="tuiMetaField" value="tui"/>
    <property key="anatomicalSiteTuis" value="T021,T022,T023,T024,T025,T026,T029,T030"/>
    <property key="procedureTuis" value="T059,T060,T061"/>
    <property key="disorderTuis" value="T019,T020,T037,T046,T047,T048,T049,T050,T190,T191"/>
    <property key="findingTuis" value="T033,T034,T040,T041,T042,T043,T044,T045,T046,T056,T057,T184"/>
    <property key="dbConnExtResrcKey" value="DbConnection"/>
    <property key="mapPrepStmt" value="select code from umls_snomed_map where cui=?"/>
  </properties>
</lookupConsumer>
</lookupBinding>
<lookupBinding>
  <dictionaryRef idRef="DICT_RXNORM"/>
  <lookupInitializer className="org.apache.ctakes.dictionary.lookup.ae.FirstTokenPermLookupInitializerImpl">
    <properties>
      <property key="textMetaFields" value="text"/>
      <property key="maxPermutationLevel" value="7"/>
      <!-- <property key="windowAnnotations" value="org.apache.ctakes.typesystem.type.textspan.Sentence"/> -->
      <property key="windowAnnotations" value="org.apache.ctakes.typesystem.type.textspan.LookupWindowAnnotation"/>
      <property key="exclusionTags" value="VB,VBD,VBG,VBN,VBP,VBZ,CC,CD,DT,EX,IN,LS,MD,PDT,POS,PP,PP$,RP,TO,WDT,WP,WPS,WRB"/>
    </properties>
  </lookupInitializer>
  <lookupConsumer className="org.apache.ctakes.dictionary.lookup.ae.UmlsToSnomedDbConsumerImpl">
    <properties>
      <property key="codingScheme" value="RXNORM"/>
      <property key="cuiMetaField" value="cui"/>
      <property key="tuiMetaField" value="tui"/>
      <property key="medicationTuis" value="T073,T103,T109,T110,T111,T115,T121,T122,T123,T130,T168,T192,T195,T197,T200,T203"/>
      <property key="dbConnExtResrcKey" value="OrangeBookIndexReader"/>
      <property key="mapPrepStmt" value="select CODERXNORM from ORANGE_BOOK where CODE=?"/>
    </properties>
  </lookupConsumer>

On Mon, Sep 22, 2014 at 9:40 AM, Pei Chen <ch...@apache.org> wrote:

> James,
> If you could share some examples, I'll take a closer look.
> But in the meantime, in your LookupDesc_Db.xml [1], I believe you can
> just use the NamedEntityLookupConsumerImpl instead of the
> UmlsToSnomedDbConsumerImpl.
> While you're there, you may also consider expanding the Lookup Window
> or Parts Of Speech exclusions...
>
> [1]
> http://svn.apache.org/repos/asf/ctakes/trunk/ctakes-dictionary-lookup-res/src/main/resources/org/apache/ctakes/dictionary/lookup/LookupDesc_Db.xml

Re: ctakes not capturing all meds/rxnorm IDs

Posted by Pei Chen <ch...@apache.org>.
James,
If you could share some examples, I'll take a closer look.
But in the meantime, in your LookupDesc_Db.xml [1], I believe you can
just use the NamedEntityLookupConsumerImpl instead of the
UmlsToSnomedDbConsumerImpl.
While you're there, you may also consider expanding the Lookup Window
or Parts Of Speech exclusions...

[1] http://svn.apache.org/repos/asf/ctakes/trunk/ctakes-dictionary-lookup-res/src/main/resources/org/apache/ctakes/dictionary/lookup/LookupDesc_Db.xml
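One way to see which consumer each binding currently uses is to list them with a short script.  This is only a sketch: it relies on the element and attribute names visible in the descriptor in this thread (lookupBinding, dictionaryRef idRef, lookupConsumer className), and the SAMPLE document, its wrapper elements, and the DICT_EXAMPLE id are made up for illustration.

```python
import xml.etree.ElementTree as ET


def list_lookup_consumers(xml_text):
    """Return (dictionary idRef, consumer className) for each lookupBinding."""
    root = ET.fromstring(xml_text)
    pairs = []
    for binding in root.iter("lookupBinding"):
        dict_ref = binding.find("dictionaryRef")
        consumer = binding.find("lookupConsumer")
        pairs.append((
            dict_ref.get("idRef") if dict_ref is not None else None,
            consumer.get("className") if consumer is not None else None,
        ))
    return pairs


# Hypothetical, trimmed-down descriptor used only to exercise the function.
# DICT_EXAMPLE and the wrapper elements are illustrative assumptions.
SAMPLE = """<lookupSpecification>
  <lookupBindings>
    <lookupBinding>
      <dictionaryRef idRef="DICT_EXAMPLE"/>
      <lookupConsumer className="org.apache.ctakes.dictionary.lookup.ae.UmlsToSnomedDbConsumerImpl"/>
    </lookupBinding>
    <lookupBinding>
      <dictionaryRef idRef="DICT_RXNORM"/>
      <lookupConsumer className="org.apache.ctakes.dictionary.lookup.ae.NamedEntityLookupConsumerImpl"/>
    </lookupBinding>
  </lookupBindings>
</lookupSpecification>"""

for dict_id, class_name in list_lookup_consumers(SAMPLE):
    print(dict_id, "->", class_name)
```

To run it against a real descriptor, read your own LookupDesc_Db.xml into a string and pass that in place of SAMPLE.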

On Mon, Sep 22, 2014 at 2:17 AM, EM Gladiators <em...@gmail.com> wrote:
> Pei, I am not seeing anything being filtered in the logs.  I checked both
> the terminal output and the application log, which I set to log
> everything.  How do I go about suppressing the OrangeBook filter and using
> only RxNorm?  I'm still not sure the filtering is the problem, because
> aspirin is also found in the OrangeBook.  But removing the OrangeBook filter
> would be a good place to start, to see whether it improves medication name
> recognition.
>
> Thank you,
>
> James

Re: ctakes not capturing all meds/rxnorm IDs

Posted by EM Gladiators <em...@gmail.com>.
Pei, I am not seeing anything being filtered in the logs.  I checked both
the terminal output and the application log, which I set to log
everything.  How do I go about suppressing the OrangeBook filter and using
only RxNorm?  I'm still not sure the filtering is the problem, because
aspirin is also found in the OrangeBook.  But removing the OrangeBook filter
would be a good place to start, to see whether it improves medication name
recognition.

Thank you,

James

On Sun, Sep 21, 2014 at 10:30 PM, Pei Chen <ch...@apache.org> wrote:

> In the logs, do you see something like "Filtered out xyz"?
> I think the default configuration applies an OrangeBook filter
> consumer, so only RxNorm concepts that also exist in the
> OrangeBook are returned.  I'm not sure whether this is causing some of
> your drugs to be missed, but it is worth taking the OrangeBook filter
> out to confirm.
> We should probably return all RxNorm concepts by default and make the
> OrangeBook filter optional in a future release.

Re: ctakes not capturing all meds/rxnorm IDs

Posted by Pei Chen <ch...@apache.org>.
In the logs, do you see something like "Filtered out xyz"?
I think the default configuration applies an OrangeBook filter
consumer, so only RxNorm concepts that also exist in the
OrangeBook are returned.  I'm not sure whether this is causing some of
your drugs to be missed, but it is worth taking the OrangeBook filter
out to confirm.
We should probably return all RxNorm concepts by default and make the
OrangeBook filter optional in a future release.
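A quick way to look for such messages is to scan the log text for the phrase.  This is only a sketch: the exact wording of the cTAKES log message is not confirmed here (the text above says "something like"), so the marker is a parameter, and the SAMPLE_LOG lines are invented for illustration.

```python
def find_filtered_lines(log_lines, marker="Filtered out"):
    """Return the log lines that contain the marker, ignoring case."""
    needle = marker.lower()
    return [line.rstrip("\n") for line in log_lines if needle in line.lower()]


# Hypothetical log excerpt; real cTAKES log lines may look different.
SAMPLE_LOG = [
    "INFO  Starting dictionary lookup\n",
    "INFO  Filtered out aspirin\n",
    "INFO  Finished dictionary lookup\n",
]

for line in find_filtered_lines(SAMPLE_LOG):
    print(line)
```

For a real log, pass an open file object (it iterates line by line) instead of SAMPLE_LOG.  An empty result only means the marker text was not found, not that no filtering took place.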
