Posted to dev@ctakes.apache.org by "Finan, Sean" <Se...@childrens.harvard.edu> on 2017/08/28 01:04:57 UTC

RE: ICD10 Dictionary Issue [EXTERNAL]

Hi Matthew,

One of the updates in cTAKES 4.0 was the long-awaited move to the current version of HyperSQL (HSQLDB), which components such as the dictionary lookup modules use.  That update made all of the older dictionary databases obsolete, because their format does not match the one used by the current HSQLDB.
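
A minimal Python sketch of how one might check which HSQLDB release wrote an existing dictionary. It assumes the dictionary's .properties file records a `version` key, as property files written by HSQLDB commonly do. The function names and the `required_major=2` threshold are illustrative assumptions and are not stated anywhere in this thread.

```python
from pathlib import Path


def read_properties(text: str) -> dict:
    """Parse simple key=value lines, skipping blank lines and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def hsqldb_major_version(properties_text: str):
    """Return the major version stored under the 'version' key, or None."""
    version = read_properties(properties_text).get("version")
    if not version:
        return None
    head = version.split(".")[0]
    return int(head) if head.isdigit() else None


def looks_outdated(properties_path: Path, required_major: int = 2) -> bool:
    """True when the file records a major version below required_major.

    required_major=2 is an assumed threshold for illustration only.
    """
    text = properties_path.read_text(encoding="latin-1")
    major = hsqldb_major_version(text)
    return major is not None and major < required_major
```

For example, `looks_outdated(Path("ctakesicd2015.properties"))` would flag a file whose recorded version starts with 1. A file with no `version` key is reported as not outdated, since nothing can be concluded from it.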

Your best option is probably to create an updated 2017 ICD dictionary of your own.  Have a look at the wiki introduction to the Dictionary Creator GUI:
https://cwiki.apache.org/confluence/display/CTAKES/Dictionary+Creator+GUI

Sean

-----Original Message-----
From: Matthew Vita [mailto:matthewvita48@gmail.com] 
Sent: Sunday, August 27, 2017 3:29 PM
To: dev@ctakes.apache.org
Subject: ICD10 Dictionary Issue [EXTERNAL]

Hi cTAKES Maintainers,

My name is Matthew Vita, a healthcare software developer and one of the OpenEMR project administrators (this is a popular open source EHR mainly used outside of the US).

I am interested in integrating cTAKES with the EMR using Docker and a friendly web frontend. Fortunately, Dr. Timothy Miller has provided an excellent Docker pipeline solution that I have been using and enhancing. However, I'm having an issue with ICD10 dictionary support. Introducing ctakesicd2015 simply has no effect.

When observing the SNOMED/RXNORM dictionary, the structure is as follows:

- sno_rx_16ab.xml
- sno_rx_16ab
  - sno_rx_16ab.properties
  - sno_rx_16ab.script

However, when one pulls down
https://sourceforge.net/p/ctakesresources/code/HEAD/tree/trunk/ctakes-resources-snomed-rword-hsqldb-2011ab/src/main/resources/org/apache/ctakes/dictionary/lookup/fast/ctakesicd2015
the structure is:

- ctakesicd2015.script
- ctakesicd2015.properties

cTAKES does not pick up these files. I wonder if it's because the ctakesicd2015.xml manifest is missing. Can anyone point me to a proper download of this file?
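
The two layouts described above can be compared with a short Python sketch. It assumes only what this message shows: a `<name>.xml` file sitting next to a `<name>/` folder that holds `<name>.properties` and `<name>.script`. The function name is hypothetical, and the check covers file layout only. It says nothing about whether the database format itself is readable by the HSQLDB version in use.

```python
from pathlib import Path


def missing_dictionary_files(base_dir: Path, name: str) -> list:
    """List the files absent from the <name>.xml + <name>/ layout.

    The expected layout mirrors the sno_rx_16ab example in the message;
    it is an assumption that other dictionaries follow the same pattern.
    """
    expected = [
        base_dir / f"{name}.xml",
        base_dir / name / f"{name}.properties",
        base_dir / name / f"{name}.script",
    ]
    return [p.relative_to(base_dir).as_posix() for p in expected if not p.is_file()]
```

Run against a folder that holds only `ctakesicd2015.script` and `ctakesicd2015.properties` at the top level, this reports all three expected paths as missing, which matches the difference noted in the message.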

I very much appreciate the hard work behind this project and look forward to hearing back.

Thanks,

Matthew Vita
www.matthewvita.com

Re: ICD10 Dictionary Issue [EXTERNAL]

Posted by "Finan, Sean" <Se...@childrens.harvard.edu>.
Hi Matthew,

That is great!  I am trying to be on vacation but can't wait to check out the video!

Sean
________________________________________
From: Matthew Vita <ma...@gmail.com>
Sent: Friday, September 1, 2017 11:24 PM
To: dev@ctakes.apache.org
Subject: Re: ICD10 Dictionary Issue [EXTERNAL]

Sean,

I am pleased to inform you that I created a YouTube video for creating
ICD10 dictionaries for cTAKES: https://www.youtube.com/watch?v=4aOnafv-NQs .
Thank you again for the tips!

I haven't actually hooked up my cTAKES to use it (
https://github.com/tmills/ctakes-docker/blob/master/ctakes-as-pipeline/Dockerfile
is what I will be using to pull it in), but I'm pretty sure I am on the right
track! Please let me know if I've missed anything.

Thanks,

Matthew Vita
www.matthewvita.com



Re: ICD10 Dictionary Issue [EXTERNAL]

Posted by Matthew Vita <ma...@gmail.com>.
Sean,

I am pleased to inform you that I created a YouTube video for creating
ICD10 dictionaries for cTAKES: https://www.youtube.com/watch?v=4aOnafv-NQs.
Thank you again for the tips!

I haven't actually hooked up my cTAKES to use it (
https://github.com/tmills/ctakes-docker/blob/master/ctakes-as-pipeline/Dockerfile
is what I will be using to pull it in), but I'm pretty sure I am on the right
track! Please let me know if I've missed anything.

Thanks,

Matthew Vita
www.matthewvita.com



RE: ICD10 Dictionary Issue [EXTERNAL]

Posted by "Finan, Sean" <Se...@childrens.harvard.edu>.
Hi Matthew,

The dictionary creator uses a 'full' UMLS installation to create the custom dictionary.  After that, any installation of cTAKES that contains your custom dictionary does not need the UMLS data.

Sean

-----Original Message-----
From: Matthew Vita [mailto:matthewvita48@gmail.com] 
Sent: Sunday, August 27, 2017 9:20 PM
To: dev@ctakes.apache.org
Subject: Re: ICD10 Dictionary Issue [EXTERNAL]

Sean,

Nice to meet you. Thank you for your high-quality work on cTAKES.

I will create the dictionary with the latest ICD10 data and contribute back my documentation + artifacts. My only question is with step 3: *"Select a UMLS installation directory. This is the directory containing the META/ subdirectory (which contains RRF files). After selecting the UMLS installation directory, the available vocabularies are gathered."* - basically, I have been providing my UMLS user/pass to the cTAKES Docker solution and letting it take care of the rest. I suppose I'm going to have to download and configure a custom UMLS dataset installation. Is this correct?

Thanks,

Matthew Vita
www.matthewvita.com



Re: ICD10 Dictionary Issue [EXTERNAL]

Posted by Matthew Vita <ma...@gmail.com>.
Sean,

Nice to meet you. Thank you for your high-quality work on cTAKES.

I will create the dictionary with the latest ICD10 data and contribute back
my documentation + artifacts. My only question is with step 3: *"Select a
UMLS installation directory. This is the directory containing the META/
subdirectory (which contains RRF files). After selecting the UMLS
installation directory, the available vocabularies are gathered."* -
basically, I have been providing my UMLS user/pass to the cTAKES Docker
solution and letting it take care of the rest. I suppose I'm going to have
to download and configure a custom UMLS dataset installation. Is this
correct?
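
Before pointing the dictionary creator at a directory, one could sanity-check it with a Python sketch along these lines. It assumes the layout quoted above (a META/ subdirectory holding RRF files) and the standard UMLS MRCONSO.RRF format, where fields are pipe-delimited and the 12th field (SAB) is the source vocabulary abbreviation. How the GUI itself gathers vocabularies is not described in this thread, so the approach and the function names here are illustrative assumptions.

```python
from pathlib import Path

# In MRCONSO.RRF the 12th pipe-delimited field (index 11) is SAB,
# the abbreviation of the source vocabulary.
SAB_COLUMN = 11


def rrf_files(umls_dir: Path) -> list:
    """Return the sorted names of RRF files under <umls_dir>/META, if any."""
    meta = umls_dir / "META"
    if not meta.is_dir():
        return []
    return sorted(p.name for p in meta.iterdir() if p.suffix.upper() == ".RRF")


def source_vocabularies(mrconso: Path) -> list:
    """Collect the distinct source abbreviations found in an MRCONSO.RRF file."""
    sources = set()
    with mrconso.open(encoding="utf-8") as handle:
        for line in handle:
            fields = line.rstrip("\n").split("|")
            if len(fields) > SAB_COLUMN:
                sources.add(fields[SAB_COLUMN])
    return sorted(sources)
```

An empty result from `rrf_files` suggests the selected directory is not the UMLS installation root. A full MRCONSO.RRF is large, so `source_vocabularies` reads it line by line rather than loading it into memory.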

Thanks,

Matthew Vita
www.matthewvita.com

