Posted to dev@ctakes.apache.org by Erick Velazquez <er...@gmail.com> on 2018/01/22 16:14:40 UTC

Building a dictionary from ontologies

Hello, 

I’m building a dictionary from an ontology (OWL), but the ontology contains neither CUIs nor TUIs. Since the dictionary format in cTAKES is CUI | TUI | TEXT or CUI | TEXT, is there a specification for creating CUIs for terms?
Thanks, 

Erick 

RE: Building a dictionary from ontologies [EXTERNAL]

Posted by "Finan, Sean" <Se...@childrens.harvard.edu>.
I think there is an example lookup XML file in the dictionary/fast/examples/

Basically, you want to copy one of those and point it to your BSV file.
Then, in your pipeline, set "LookupXml" to point to that XML file.  You can do this with -l if you are running the default pipeline command-line script, or you can point to it in the piper GUI.  The GUI is probably the easiest option for a new user, and it also lets you save your setup.

Sean

-----Original Message-----
From: Erick Velazquez [mailto:erick.lerouge@gmail.com] 
Sent: Wednesday, January 24, 2018 11:23 AM
To: dev@ctakes.apache.org
Subject: Re: Building a dictionary from ontologies [EXTERNAL]

Hi Sean, 
Thanks for your help. I now have my BSV file, but I cannot find documentation that explains how to include it in the cTAKES analysis. Is there a document that can help me?
Kind regards, 

Erick 

> On Jan 23, 2018, at 5:21 PM, Finan, Sean <Se...@childrens.harvard.edu> wrote:
> 
> Hi Erick,
> 
> Each synonym gets a single line in the bsv file.  So:
> 
> OWL00770 | TOOO | Right parietal lobe
> OWL00770 | TOOO | parietal lobe, right
> OWL00770 | TOOO | parietal lobe on the right
> 
> If you don't have a tui then you can simplify the lines by leaving the second column empty:
> 
> OWL00770 | | Right parietal lobe
> OWL00770 | | parietal lobe, right
> OWL00770 | | parietal lobe on the right
> 
> 
> Sean
> 
> -----Original Message-----
> From: Erick Velazquez [mailto:erick.lerouge@gmail.com] 
> Sent: Tuesday, January 23, 2018 5:15 PM
> To: dev@ctakes.apache.org
> Subject: Re: Building a dictionary from ontologies [EXTERNAL]
> 
> Hi Sean,
> 
> Thank you for your answer!
> 
> I would like to show you my results. As an example, I got this:
> 
> 
> 
> OWL00770 | TOOO | Right parietal lobe
> 
> 
> 
> The third column is the text.
> 
> You suggested using the URI, but what I get from the ontology is only a web link, so I do not use the preferred text option.
> 
> When you say that the text should contain synonyms, what do you mean? Is every token in the text column treated as a synonym? In my example, would "right" be interpreted as a synonym of "parietal" and "lobe"?
> 
> 
> 
> Kind regards,
> 
> 
> 
> Erick Velazquez 
> 
> 
>> On Jan 22, 2018, at 2:47 PM, Finan, Sean <Se...@childrens.harvard.edu> wrote:
>> 
>> Hi Erick,
>> 
>> There is a fourth option that should work
>> 
>> cui | tui | text | preferredText
>> 
>> I would create an importer that creates a -fake- cui.  The cui need not (in this case should not) start with 'C'.  So, I would import per-owl uri using something like OWL00001.  
>> 
>> tui can be empty, in which case "T000" will be used, forcing cTAKES to create annotations of unknown semantic type.  
>> 
>> text(s) should contain your synonym(s).
>> 
>> preferredText can be your owl uri.
>> 
>> This should allow you to fake it with an imported owl.  Upon deconstruction of the cas you will want to look at the preferredTerm for each annotation and ignore the cui and tui.
>> 
>> Sean 
>> 
>> ________________________________________
>> From: Erick Velazquez <er...@gmail.com>
>> Sent: Monday, January 22, 2018 11:14 AM
>> To: dev@ctakes.apache.org
>> Subject: Building a dictionary from ontologies  [EXTERNAL]
>> 
>> Hello,
>> 
>> I’m building a dictionary from an ontology (OWL), but the ontology contains neither CUIs nor TUIs. Since the dictionary format in cTAKES is CUI | TUI | TEXT or CUI | TEXT, is there a specification for creating CUIs for terms?
>> Thanks,
>> 
>> Erick
> 


Re: Building a dictionary from ontologies [EXTERNAL]

Posted by Erick Velazquez <er...@gmail.com>.
Hi Sean, 
Thanks for your help. I now have my BSV file, but I cannot find documentation that explains how to include it in the cTAKES analysis. Is there a document that can help me?
Kind regards, 

Erick 



RE: Building a dictionary from ontologies [EXTERNAL]

Posted by "Finan, Sean" <Se...@childrens.harvard.edu>.
Hi Erick,

Each synonym gets a single line in the bsv file.  So:

OWL00770 | TOOO | Right parietal lobe
OWL00770 | TOOO | parietal lobe, right
OWL00770 | TOOO | parietal lobe on the right

If you don't have a tui then you can simplify the lines by leaving the second column empty:

OWL00770 | | Right parietal lobe
OWL00770 | | parietal lobe, right
OWL00770 | | parietal lobe on the right
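To make the one-synonym-per-line layout concrete, here is a minimal Python sketch that reads lines like the ones above and groups the phrases by code. It is only an illustration and not how cTAKES itself loads a dictionary: the function name `group_synonyms` is hypothetical, the sketch handles only lines with three or more columns, and it assumes whitespace around the bars can be trimmed. An empty second column falls back to "T000", as described earlier in this thread. Note that each line contributes one whole phrase; the phrase is not split into individual words.

```python
from collections import defaultdict


def group_synonyms(bsv_lines, default_tui="T000"):
    """Group BSV lines by (cui, tui); each line adds one whole phrase.

    Hypothetical helper for illustration; not part of cTAKES.
    """
    terms = defaultdict(list)
    for raw in bsv_lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        # Assumption: whitespace around the bar separators is not significant.
        columns = [column.strip() for column in line.split("|")]
        if len(columns) < 3:
            raise ValueError(f"expected cui | tui | text, got: {raw!r}")
        cui, tui, text = columns[0], columns[1] or default_tui, columns[2]
        terms[(cui, tui)].append(text)
    return dict(terms)


sample = [
    "OWL00770 | | Right parietal lobe",
    "OWL00770 | | parietal lobe, right",
    "OWL00770 | | parietal lobe on the right",
]
for (cui, tui), phrases in group_synonyms(sample).items():
    print(cui, tui, phrases)
```

All three phrases end up under the same code, which is the point of repeating the first column on every line.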


Sean



Re: Building a dictionary from ontologies [EXTERNAL]

Posted by Erick Velazquez <er...@gmail.com>.
Hi Sean,

Thank you for your answer!

I would like to show you my results. As an example, I got this:

 

OWL00770 | TOOO | Right parietal lobe

 

The third column is the text.

You suggested using the URI, but what I get from the ontology is only a web link, so I do not use the preferred text option.

When you say that the text should contain synonyms, what do you mean? Is every token in the text column treated as a synonym? In my example, would "right" be interpreted as a synonym of "parietal" and "lobe"?

 

Kind regards,

 

Erick Velazquez 




Re: Building a dictionary from ontologies [EXTERNAL]

Posted by "Finan, Sean" <Se...@childrens.harvard.edu>.
Hi Erick,

There is a fourth option that should work

cui | tui | text | preferredText

I would create an importer that creates a -fake- cui.  The cui need not (in this case should not) start with 'C'.  So, I would import per-owl uri using something like OWL00001.  
 
tui can be empty, in which case "T000" will be used, forcing cTAKES to create annotations of unknown semantic type.  

text(s) should contain your synonym(s).

preferredText can be your owl uri.

This should allow you to fake it with an imported owl.  Upon deconstruction of the cas you will want to look at the preferredTerm for each annotation and ignore the cui and tui.
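A rough Python sketch of such an importer is below, under several assumptions. The function name `build_bsv_lines`, the example URI, and the choice to number URIs in sorted order are all hypothetical; the input is assumed to be a mapping from OWL URI to synonym list that has already been extracted from the ontology by some other means. It writes bars without surrounding spaces, which is also an assumption about what the loader accepts.

```python
def build_bsv_lines(concepts, prefix="OWL", tui=""):
    """Turn {owl_uri: [synonym, ...]} into cui|tui|text|preferredText lines.

    Hypothetical helper for illustration; not part of cTAKES.
    """
    lines = []
    # Assumption: numbering URIs in sorted order keeps the fake CUIs stable
    # across runs as long as the set of URIs does not change.
    for index, (uri, synonyms) in enumerate(sorted(concepts.items()), start=1):
        cui = f"{prefix}{index:05d}"  # fake CUI that does not start with 'C'
        for text in synonyms:
            if "|" in text or "|" in uri:
                raise ValueError(f"bar character not allowed in a column: {text!r}")
            # One line per synonym; the URI goes in the preferredText column.
            lines.append("|".join([cui, tui, text, uri]))
    return lines


concepts = {
    "http://example.org/onto#RightParietalLobe": [  # hypothetical URI
        "Right parietal lobe",
        "parietal lobe, right",
    ],
}
for line in build_bsv_lines(concepts):
    print(line)
```

Leaving `tui` empty matches the case described above where the semantic type is unknown.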

Sean 
