Posted to user@ctakes.apache.org by Alan Simmons <al...@tempus.com> on 2016/12/27 22:06:11 UTC
expanding cTAKES to use concepts from vocabularies other than SNOMED
and RXNorm
Hi. I've been working with cTAKES for a few weeks now. I'm running the
standard CPE from the command line and generating CAS files that include
SNOMED and RxNorm concepts.
I'd like to expand my annotation to include concepts from vocabularies
other than SNOMED and RxNorm: specifically, NCI Thesaurus terms for
cancer-specific concepts that are not in SNOMED, e.g., "Stage IB non-small
cell lung cancer" (UMLS CUI C1336139). What's the best way to accomplish
this?
Regards,
Alan Simmons
--
J. Alan Simmons
Solution Architect
(c) +1.773.220.5018
--
This email and any attachments may contain privileged and confidential
information and/or protected health information (PHI) that is protected by
federal and state privacy laws. It is intended solely for the use of
Tempus Labs and the recipient(s) named above. Nothing contained in this
communication and any attachments thereto is intended to waive any
privileges or rights of confidentiality. If you are not the recipient, or
the employee or agent responsible for delivering this message to the
intended recipient, you are hereby notified that any review, dissemination,
distribution, printing or copying of this email message and/or any
attachments is strictly prohibited. If you have received this
transmission in error, please notify us immediately at (877)-654-5544 and
permanently delete this email and any attachments.
Re: expanding cTAKES to use concepts from vocabularies other than
SNOMED and RXNorm
Posted by Alan Simmons <al...@tempus.com>.
Thanks, Guergana and Sean.
We've been using the dictionary tool to build an updated UMLS dictionary,
but have only been seeing SNOMED and RxNorm concepts in our output. After
receiving your message, we reviewed the dictionary tool.
It seems to us that to add a dictionary other than the defaults (SNOMED,
RxNorm, and ICD), we would need to make significant changes, including some
hard coding in a Java class. Before we go that route, we thought that we'd
ask for a sanity check.
It appears that we would need to:
- Include new vocabularies in the dictionary tool's
ConversionSources.txt, making it look more like the "optional" version
instead of the "default" one (i.e.,
https://svn.apache.org/repos/asf/ctakes/sandbox/dictionary-gui/data/default/ConversionSources.txt).
Easy enough.
- Add custom property keys for the desired dictionaries to the
cTakesHsql.xml file. The default file currently has keys for SNOMED,
RxNorm, ICD-9, and ICD-10. Also straightforward.
- Update the code in the class
org.apache.ctakes.dictionary.lookup2.concept.JdbcConceptFactory. This class
seems to be hard-coded to look for the SNOMED, RxNorm, etc. tags in
cTakesHsql.xml (e.g., <property key="snomedTable" value="snomedct"/>). Then
recompile the class. This is something that we'd rather avoid, of course.
Is that all that we would need to do? Is there a simpler way?
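To make the third point concrete, here is a rough sketch of the kind of generic lookup we were hoping for. It is plain Python for illustration only, not cTAKES code, and it does not reflect how JdbcConceptFactory actually works. It assumes a cTakesHsql.xml-style fragment and treats every <property> whose key ends in "Table" as a vocabulary table, rather than checking for a fixed list of key names. The "nciTable" key, the other sample entries, and the key-suffix convention itself are all hypothetical; only the "snomedTable" line comes from the default file.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment modeled on the <property> elements in cTakesHsql.xml.
# Only the "snomedTable" line is taken from the default file; the "nciTable"
# and "exampleSetting" entries are made up for this illustration.
SAMPLE = """
<properties>
  <property key="snomedTable" value="snomedct"/>
  <property key="nciTable" value="nci"/>
  <property key="exampleSetting" value="ignored"/>
</properties>
"""


def vocabulary_tables(xml_text, suffix="Table"):
    """Return {vocabulary name: table name} for every key ending in `suffix`.

    The suffix convention is an assumption of this sketch, not a documented
    cTAKES rule.
    """
    root = ET.fromstring(xml_text)
    tables = {}
    for prop in root.iter("property"):
        key = prop.get("key", "")
        value = prop.get("value")
        # Skip keys that are only the suffix, and properties with no value.
        if key.endswith(suffix) and len(key) > len(suffix) and value:
            tables[key[: -len(suffix)]] = value
    return tables


if __name__ == "__main__":
    for vocab, table in sorted(vocabulary_tables(SAMPLE).items()):
        print(vocab, "->", table)
```

With a lookup along these lines, adding a vocabulary would only require a new property entry and no recompilation. Whether the real factory can be configured this way is exactly what we are asking.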
Regards,
Alan
On Tue, Dec 27, 2016 at 7:31 PM, Savova, Guergana <
Guergana.Savova@childrens.harvard.edu> wrote:
> Hi Alan,
>
> There is a module for building a dictionary off any vocabulary. It was
> Sean Finan who wrote the code. Sean is out until Jan 3; I am sure he will
> get back to you when he comes back from the holidays. From what I remember,
> the code is straightforward to use.
>
> Happy Holidays!
>
> --Guergana
RE: expanding cTAKES to use concepts from vocabularies other than
SNOMED and RXNorm
Posted by "Savova, Guergana" <Gu...@childrens.harvard.edu>.
Hi Alan,
There is a module for building a dictionary off any vocabulary. It was Sean Finan who wrote the code. Sean is out until Jan 3; I am sure he will get back to you when he comes back from the holidays. From what I remember, the code is straightforward to use.
Happy Holidays!
--Guergana
Guergana Savova, PhD, FACMI
Associate Professor
PI Natural Language Processing Lab
Boston Children's Hospital and Harvard Medical School
300 Longwood Avenue
Mailstop: BCH3092
Enders 144.1
Boston, MA 02115
Tel: (617) 919-2972
Fax: (617) 730-0817
Guergana.Savova@childrens.harvard.edu
Harvard Scholar: http://scholar.harvard.edu/guergana_k_savova/biocv
ctakes.apache.org
thyme.healthnlp.org
cancer.healthnlp.org
share.healthnlp.org