Posted to dev@ctakes.apache.org by "Miller, Timothy" <Ti...@childrens.harvard.edu> on 2017/02/15 19:25:54 UTC

Re: Phenotype-specific entities

Lol was just about to send this:
https://github.com/tmills/umls-graph-api

It points at your UMLS META directory, reads in the cTAKES list of TUIs, builds a Neo4j graph database with all the ISA links, and has a simple API for getting parent/child CUIs.
I used it for coreference.
Tim
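
For readers without a Neo4j setup, the parent/child lookup described above can be approximated in memory. This is only a sketch under stated assumptions: the class `IsaGraph` and its method names are hypothetical, not the umls-graph-api interface, and the CUIs are made-up placeholders.

```python
from collections import defaultdict, deque


class IsaGraph:
    """Hypothetical in-memory stand-in for an ISA graph (not the umls-graph-api interface)."""

    def __init__(self):
        self._parents = defaultdict(set)   # child CUI -> parent CUIs
        self._children = defaultdict(set)  # parent CUI -> child CUIs

    def add_isa(self, child, parent):
        """Record that `child` ISA `parent`."""
        self._parents[child].add(parent)
        self._children[parent].add(child)

    def parents(self, cui):
        return set(self._parents.get(cui, ()))

    def children(self, cui):
        return set(self._children.get(cui, ()))

    def ancestors(self, cui):
        """All transitive hypernyms of `cui`, collected breadth-first."""
        seen, queue = set(), deque([cui])
        while queue:
            for parent in self._parents.get(queue.popleft(), ()):
                if parent not in seen:
                    seen.add(parent)
                    queue.append(parent)
        return seen


# Made-up placeholder CUIs, for illustration only.
graph = IsaGraph()
graph.add_isa("C0000003", "C0000002")
graph.add_isa("C0000002", "C0000001")
print(graph.parents("C0000003"))
print(graph.ancestors("C0000003"))
```

A plain dictionary of sets is enough for one-hop parent/child queries; a graph database mainly pays off once the full Metathesaurus is loaded and traversals get deeper.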

________________________________________
From: Finan, Sean <Se...@childrens.harvard.edu>
Sent: Wednesday, February 15, 2017 2:23 PM
To: dev@ctakes.apache.org
Subject: RE: Phenotype-specific entities

The dictionary GUI doesn't walk the ontology.  There are UMLS tables that list relations, wherein things like "isa" (is a) relations may satisfy a hypernym requirement.  If you have the UMLS RRF files, look at MRREL.RRF.  The structure is basically concept1|..|concept2|..|relationtype|..   See section 3.3.9: https://www.ncbi.nlm.nih.gov/books/NBK9685/
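
A minimal sketch of pulling "isa" rows out of MRREL-style lines, assuming the standard pipe-delimited MRREL.RRF column order (CUI1, AUI1, STYPE1, REL, CUI2, AUI2, STYPE2, RELA, ...). The function name `read_relations` is hypothetical, and the sample rows are made up to show the layout; they are not real UMLS content.

```python
# 0-based MRREL.RRF column positions, assuming the standard layout:
# CUI1|AUI1|STYPE1|REL|CUI2|AUI2|STYPE2|RELA|RUI|SRUI|SAB|SL|RG|DIR|SUPPRESS|CVF|
CUI1, REL, CUI2, RELA = 0, 3, 4, 7


def read_relations(lines, rela="isa"):
    """Yield (cui1, rel, cui2) for rows whose RELA field equals `rela`."""
    for line in lines:
        fields = line.rstrip("\r\n").split("|")
        # Skip short or malformed rows instead of raising IndexError.
        if len(fields) > RELA and fields[RELA] == rela:
            yield fields[CUI1], fields[REL], fields[CUI2]


# Made-up rows in MRREL layout; real rows would come from MRREL.RRF.
SAMPLE = [
    "C0000002|A0000002|AUI|CHD|C0000003|A0000003|AUI|isa|R0000001||SRC|SRC|||N||",
    "C0000002|A0000002|AUI|RO|C0000004|A0000004|AUI|associated_with|R0000002||SRC|SRC|||N||",
]

isa_rows = list(read_relations(SAMPLE))
print(isa_rows)
```

To run it on a real release, pass an open file handle for MRREL.RRF in place of `SAMPLE`. As far as the UMLS reference manual describes it, REL/RELA state the relationship that CUI2 has to CUI1, so in an "isa" row CUI1 would be the broader concept; it is worth checking that against a few known concepts in your own release before relying on it.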

If anybody has made anything that parses/uses UMLS relations and can be used by cTAKES, please contribute!  Something like a simple traversable UMLS graph DB would be a great addition.  Even if it is incomplete or rough, it could be a valuable seed for a new effort.

Sean

-----Original Message-----
From: Savova, Guergana [mailto:Guergana.Savova@childrens.harvard.edu]
Sent: Wednesday, February 15, 2017 1:54 PM
To: dev@ctakes.apache.org
Subject: RE: Phenotype-specific entities

I don't believe there is a tool for walking the UMLS ontology, Dima. But Sean should confirm that his dictionary building tool does not have that functionality.

I think you can use the UMLS tables to get that information. It has been quite a while since I used these tables, but I remember I was able to get that information from them...

Sean,
Does your dictionary building tool implement ontology walking?

--Guergana

-----Original Message-----
From: Dligach, Dmitriy [mailto:ddligach@luc.edu]
Sent: Wednesday, February 15, 2017 1:50 PM
To: dev@ctakes.apache.org
Subject: Re: Phenotype-specific entities

Guergana, thank you.

Is there anything in cTAKES now for walking the UMLS ontology (e.g. for finding hypernyms, synonyms, etc.)?

Dima



> On Feb 15, 2017, at 12:45, Savova, Guergana <Gu...@childrens.harvard.edu> wrote:
>
> Hi Erin,
> Yes, creating your customized dictionary is the way to go. You can prune by semantic types of interest and then remove branches that are not relevant to your specific phenotype. I am not aware of cTAKES implementing such a tool for a very customized dictionary.
>
> You can also start with a few terms that you know are relevant to your phenotype and then find their synonyms in the UMLS. Then you can walk a specific ontology further and take siblings and parents if you think they are relevant.
>
> Then, there is the whole field of using word embeddings to find synonyms/related terms from unlabeled data, if you want to become really fancy :-) At this point, cTAKES does not implement any deep learning algorithms; in the future we are planning to release a bridge to Keras.
>
> I hope this makes sense.
>
> --
> Guergana Savova, PhD, FACMI
> Associate Professor
> PI Natural Language Processing Lab
> Boston Children's Hospital and Harvard Medical School
> 300 Longwood Avenue
> Mailstop: BCH3092
> Enders 144.1
> Boston, MA 02115
> Tel: (617) 919-2972
> Fax: (617) 730-0817
> Guergana.Savova@childrens.harvard.edu
> Harvard Scholar: http://scholar.harvard.edu/guergana_k_savova/biocv
> ctakes.apache.org
> thyme.healthnlp.org
> cancer.healthnlp.org
> share.healthnlp.org
>
>
> -----Original Message-----
> From: Erin Nicole Gustafson [mailto:erin.gustafson@northwestern.edu]
> Sent: Wednesday, February 15, 2017 1:38 PM
> To: dev@ctakes.apache.org
> Subject: Phenotype-specific entities
>
> Hi all,
>
> I would like to be able to only identify entities that are relevant for some specific phenotype. One step towards achieving this would be to build a custom dictionary with a limited set of semantic types. However, this is not quite specific enough to only identify mentions related to one disease while ignoring those related to some other disease, for example.
>
> Does cTAKES currently have a way to do this sort of filtering? Or, has anyone developed their own tools that they'd be willing to share?
>
> Thanks,
> Erin
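
The seed-term approach suggested earlier in the thread (start from a few known-relevant terms, then walk the ontology and take parents and siblings) can be sketched as below. It assumes a child-to-parents map has already been built from the UMLS relation tables; `expand_seeds`, `PARENTS_OF`, and the CUIs are hypothetical placeholders, and the result is only a candidate list to review by hand, not a finished dictionary.

```python
def expand_seeds(seeds, parents_of, include_siblings=True):
    """Return the seeds plus their direct parents and, optionally, their siblings.

    `parents_of` maps a child CUI to the set of its parent CUIs.
    """
    # Invert the child -> parents map so siblings can be looked up by parent.
    children_of = {}
    for child, parents in parents_of.items():
        for parent in parents:
            children_of.setdefault(parent, set()).add(child)

    expanded = set(seeds)
    for seed in seeds:
        for parent in parents_of.get(seed, ()):
            expanded.add(parent)
            if include_siblings:
                # Siblings are the other children of the same parent.
                expanded.update(children_of[parent])
    return expanded


# Made-up placeholder CUIs, for illustration only.
PARENTS_OF = {
    "C0000002": {"C0000001"},
    "C0000003": {"C0000001"},
    "C0000005": {"C0000004"},
}

candidates = expand_seeds({"C0000002"}, PARENTS_OF)
print(sorted(candidates))
```

Only one hop is taken in each direction, which keeps the candidate set small enough to check manually for relevance to the phenotype.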


Re: Phenotype-specific entities

Posted by "Dligach, Dmitriy" <dd...@luc.edu>.
Very nice! Thank you, Tim and Sean.

Dima





RE: Phenotype-specific entities

Posted by Erin Nicole Gustafson <er...@northwestern.edu>.
Thanks, all!

-Erin




RE: Phenotype-specific entities

Posted by "Finan, Sean" <Se...@childrens.harvard.edu>.
Hi Tim,
Lol rbay, I remembered your talking about this and wondered if you would bite!
Cheers!
Sean
