Posted to dev@ctakes.apache.org by Erin Nicole Gustafson <er...@northwestern.edu> on 2017/02/15 18:38:00 UTC

Phenotype-specific entities

Hi all,

I would like to be able to only identify entities that are relevant for some specific phenotype. One step towards achieving this would be to build a custom dictionary with a limited set of semantic types. However, this is not quite specific enough to only identify mentions related to one disease while ignoring those related to some other disease, for example.

Does cTAKES currently have a way to do this sort of filtering? Or, has anyone developed their own tools that they'd be willing to share?

Thanks,
Erin

Re: Phenotype-specific entities

Posted by "Dligach, Dmitriy" <dd...@luc.edu>.
Very nice! Thank you, Tim and Sean.

Dima

RE: Phenotype-specific entities

Posted by Erin Nicole Gustafson <er...@northwestern.edu>.
Thanks, all!

-Erin


RE: Phenotype-specific entities

Posted by "Finan, Sean" <Se...@childrens.harvard.edu>.
Hi Tim,
Lol rbay, I remembered your talking about this and wondered if you would bite!
Cheers!
Sean


Re: Phenotype-specific entities

Posted by "Miller, Timothy" <Ti...@childrens.harvard.edu>.
Lol was just about to send this:
https://github.com/tmills/umls-graph-api

It points at your UMLS META directory, reads in the cTAKES list of TUIs, builds a Neo4j graph database with all the ISA links, and has a simple API for getting parent/child CUIs.
I used it for coreference.
Tim
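
To illustrate the kind of parent/child lookup described above, here is a minimal in-memory sketch. It is not the umls-graph-api interface and does not use Neo4j; the class name IsaGraph, its methods, and the identifiers in the toy data are all hypothetical placeholders chosen for illustration.

```python
from collections import defaultdict


class IsaGraph:
    """Tiny in-memory ISA graph keyed by CUI (hypothetical, for illustration only)."""

    def __init__(self):
        self._parents = defaultdict(set)
        self._children = defaultdict(set)

    def add_isa(self, child, parent):
        """Record that `child` ISA `parent`."""
        self._parents[child].add(parent)
        self._children[parent].add(child)

    def parents(self, cui):
        """Direct parents of `cui` (empty set if unknown)."""
        return set(self._parents.get(cui, ()))

    def children(self, cui):
        """Direct children of `cui` (empty set if unknown)."""
        return set(self._children.get(cui, ()))

    def is_hypernym(self, ancestor, cui):
        """True if `ancestor` can be reached from `cui` by following ISA links upward."""
        seen, stack = set(), [cui]
        while stack:
            for parent in self._parents.get(stack.pop(), ()):
                if parent == ancestor:
                    return True
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return False


# Placeholder identifiers, not real UMLS CUIs.
graph = IsaGraph()
graph.add_isa("C_TYPE2_DIABETES", "C_DIABETES")
graph.add_isa("C_DIABETES", "C_ENDOCRINE_DISEASE")
```

A hypernym check such as graph.is_hypernym("C_ENDOCRINE_DISEASE", "C_TYPE2_DIABETES") is the sort of query that could support the coreference use mentioned above; a graph database would answer the same question with a path query instead of an in-process traversal.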


RE: Phenotype-specific entities

Posted by "Finan, Sean" <Se...@childrens.harvard.edu>.
The dictionary GUI doesn't walk the ontology. There are UMLS tables that list relations, in which things like "isa" (is a) relations may satisfy a hypernym requirement. If you have the UMLS RRF files, look at MRREL.RRF. The structure is basically concept1|..|concept2|..|relationtype|.. See section 3.3.9: https://www.ncbi.nlm.nih.gov/books/NBK9685/

If anybody has made anything that parses/uses UMLS relations and can be used by cTAKES, please contribute! Something like a simple traversable UMLS graph DB would be a great addition... Even if it is incomplete or rough, it could be a valuable seed for a new effort.

Sean
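
As a rough sketch of reading hierarchy links out of MRREL.RRF: the file is pipe-delimited, and its leading columns are CUI1|AUI1|STYPE1|REL|CUI2|AUI2|STYPE2|RELA|..., so the relation label (REL) actually sits between the two concepts and the finer-grained RELA (where "isa" appears) follows CUI2. The code below assumes that REL describes how CUI2 relates to CUI1, so a PAR row means CUI2 is a parent of CUI1; it is worth checking that direction against the UMLS Reference Manual for your release. The function names and the sample rows are made up for illustration.

```python
from collections import defaultdict

# 0-based column positions in MRREL.RRF (pipe-delimited).
CUI1, REL, CUI2 = 0, 3, 4


def load_parents(lines):
    """Map each CUI to the set of its direct parent CUIs, using REL == 'PAR' rows."""
    parents = defaultdict(set)
    for line in lines:
        fields = line.rstrip("\n").split("|")
        try:
            cui1, rel, cui2 = fields[CUI1], fields[REL], fields[CUI2]
        except IndexError:
            continue  # skip blank or truncated rows
        # Assumption: REL is CUI2's relation to CUI1, so PAR means CUI2 is the parent.
        if rel == "PAR" and cui1 != cui2:
            parents[cui1].add(cui2)
    return parents


def ancestors(cui, parents):
    """Collect every CUI reachable from `cui` by following parent links."""
    seen, stack = set(), [cui]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# Synthetic rows in MRREL.RRF layout; the CUIs and sources are placeholders.
sample = [
    "C0000003|A1|AUI|PAR|C0000002|A2|AUI||R1||DEMO|DEMO|||N||",
    "C0000002|A2|AUI|PAR|C0000001|A3|AUI||R2||DEMO|DEMO|||N||",
    "C0000003|A1|AUI|RO|C0000009|A9|AUI||R3||DEMO|DEMO|||N||",
]
parent_map = load_parents(sample)
```

With a real release you would pass an open file handle for MRREL.RRF instead of the sample list; since the file is large, streaming it line by line as above avoids loading it all at once.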


RE: Phenotype-specific entities

Posted by "Savova, Guergana" <Gu...@childrens.harvard.edu>.
I don't believe there is a tool for walking the UMLS ontology, Dima. But Sean should confirm that his dictionary building tool does not have that functionality.

I think you can use the UMLS tables to get that information. It has been quite a while since I used these tables, but I remember I was able to get that information from them...

Sean,
Does your dictionary building tool implement ontology walking?

--Guergana


Re: Phenotype-specific entities

Posted by "Dligach, Dmitriy" <dd...@luc.edu>.
Guergana, thank you. 

Is there anything in cTAKES now for walking the UMLS ontology (e.g. for finding hypernyms, synonyms, etc.)?

Dima

RE: Phenotype-specific entities

Posted by "Savova, Guergana" <Gu...@childrens.harvard.edu>.
Hi Erin,
Yes, creating your customized dictionary is the way to go. You can prune by semantic types of interest and then remove branches that are not relevant to your specific phenotype. I am not aware of cTAKES implementing such a tool for a very customized dictionary.

You can also start with a few terms that you know are relevant to your phenotype and then find their synonyms in the UMLS. Then, you can further walk a specific ontology and take siblings and parents if you think they are relevant.
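
A minimal sketch of that seed-and-walk idea, assuming you already have (child, parent) ISA pairs and a CUI-to-TUI mapping from your UMLS files. The function names, the depth limit, and the toy identifiers are hypothetical; only the TUI codes T047 (Disease or Syndrome) and T046 (Pathologic Function) are real, and their assignment to the placeholder concepts is invented for the example.

```python
from collections import defaultdict, deque


def expand_seeds(seeds, isa_pairs, max_depth=2):
    """Return the seed CUIs plus their descendants down to `max_depth` levels."""
    children = defaultdict(set)
    for child, parent in isa_pairs:
        children[parent].add(child)

    keep = set(seeds)
    queue = deque((seed, 0) for seed in seeds)
    while queue:
        cui, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not descend past the depth limit
        for child in children.get(cui, ()):
            if child not in keep:
                keep.add(child)
                queue.append((child, depth + 1))
    return keep


def filter_by_semantic_type(cuis, cui_to_tuis, allowed_tuis):
    """Keep only CUIs that carry at least one of the allowed semantic types."""
    return {c for c in cuis if cui_to_tuis.get(c, set()) & allowed_tuis}


# Toy data: placeholder identifiers, not real UMLS CUIs.
ISA_PAIRS = [
    ("C_T1DM", "C_DM"),
    ("C_T2DM", "C_DM"),
    ("C_DKA", "C_T1DM"),
    ("C_ASTHMA", "C_LUNG"),
]
CUI_TO_TUIS = {
    "C_DM": {"T047"},
    "C_T1DM": {"T047"},
    "C_T2DM": {"T047"},
    "C_DKA": {"T046"},
    "C_ASTHMA": {"T047"},
}

phenotype_cuis = expand_seeds({"C_DM"}, ISA_PAIRS)
disease_only = filter_by_semantic_type(phenotype_cuis, CUI_TO_TUIS, {"T047"})
```

In the toy data, C_ASTHMA shares a semantic type with the diabetes concepts but is never reached from the seed, which is the distinction a semantic-type filter alone cannot make. The resulting CUI set could then serve as a whitelist when building the custom dictionary.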

Then, there is the whole field of using word embeddings to find synonyms/related terms from unlabeled data if you want to become really fancy :-) At this point, cTAKES does not implement any deep learning algorithms; in the future we are planning to release a bridge to Keras.

I hope this makes sense.

--
Guergana Savova, PhD, FACMI
Associate Professor
PI Natural Language Processing Lab
Boston Children's Hospital and Harvard Medical School
300 Longwood Avenue
Mailstop: BCH3092
Enders 144.1
Boston, MA 02115
Tel: (617) 919-2972
Fax: (617) 730-0817
Guergana.Savova@childrens.harvard.edu
Harvard Scholar: http://scholar.harvard.edu/guergana_k_savova/biocv
ctakes.apache.org
thyme.healthnlp.org
cancer.healthnlp.org
share.healthnlp.org

