Posted to dev@ctakes.apache.org by Johanne Krogsgaard Jensen <jj...@student.aau.dk> on 2021/04/09 09:16:17 UTC

Accessing concepts for use of Preferred Terms

Hello everyone,

We are aiming to integrate post-coordinated SNOMED CT terms into a version of cTAKES. To do this we need to access the concepts (SNOMED CT terms) that cTAKES generates by default. So far we have been working in the process function in the file AbstractJCasTermAnnotator.java. There we try to access the CollectionMap allConceptsMap, which to our understanding includes all the concepts generated by cTAKES (is this correct?). We are able to get the keys and one of the collections using the following:

System.out.println(allConceptsMap.keySet());
System.out.println(allConceptsMap.getCollection(key));

This results in the following output:
[30193, 30705, 16658, 231749, 678226, 4057, 3263723, 262926, 1963703, 3263722]
[org.apache.ctakes.dictionary.lookup2.concept.DefaultConcept@6fe7b7eb]

Our main question is how to access this: [org.apache.ctakes.dictionary.lookup2.concept.DefaultConcept@6fe7b7eb] in order to obtain preferred terms and CUI codes?
One of our concerns is that we cannot figure out the types of the concepts and collections.

We have also given the CAS a try, but didn't quite understand it.

Thank you in advance!

Best regards
Thea Mentz and Johanne Krogsgaard




RE: Accessing concepts for use of Preferred Terms [EXTERNAL]

Posted by Johanne Krogsgaard Jensen <jj...@student.aau.dk>.
Hi Sean

I have now had time to go through it, and thank you! It was a huge help for newbies like us.

Best regards
Thea Mentz and Johanne Krogsgaard




Re: Accessing concepts for use of Preferred Terms [EXTERNAL]

Posted by "Finan, Sean" <Se...@childrens.harvard.edu>.
Hi Thea and Johanne,


I have an "easy button" recommendation at the end.  Feel free to skip to that if you don't want to look at utility methods and just want to see some example code.


> Here we try to access the CollectionMap allConceptsMap, which to our understanding includes all the concepts generated by cTAKES (is this correct?).

-- While not technically incorrect, it is unusual.  Normally one would not reference code within another annotation engine, but would obtain information stored in the jcas.

I tend to fetch UMLS-coded things in two ways.

1.  Use the uimafit utility JCasUtil.  Calling JCasUtil.select( jcas, IdentifiedAnnotation.class ) will get you everything that has UmlsConcept information.

http://javadox.com/org.apache.uima/uimafit-core/2.1.0/org/apache/uima/fit/util/JCasUtil.html


2.  Use the ctakes utility OntologyConceptUtil.  There are a lot of methods that can help.

http://ctakes.apache.org/apidocs/4.0.0/org/apache/ctakes/core/util/OntologyConceptUtil.html


You can use the two together.  For instance, obtain IdentifiedAnnotations using JCasUtil and then codes within them using the OntologyConceptUtil.
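
To make the two-step pattern above concrete, here is a minimal, self-contained sketch of its shape: first obtain the annotations, then read the codes attached to each one. It deliberately does not call cTAKES or uimaFIT. StandInAnnotation, StandInConcept and collectCuis are hypothetical names invented for this illustration, and the CUIs and texts are made-up placeholders. In a real annotator, the hand-built list would presumably come from JCasUtil.select and the inner loop would be replaced by an OntologyConceptUtil call on each annotation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class AnnotationCodesSketch {

    /** Hypothetical stand-in for a UMLS concept; not the cTAKES UmlsConcept type. */
    public static final class StandInConcept {
        public final String cui;
        public final String preferredText;

        public StandInConcept(String cui, String preferredText) {
            this.cui = cui;
            this.preferredText = preferredText;
        }
    }

    /** Hypothetical stand-in for an annotation: a text span plus its concepts. */
    public static final class StandInAnnotation {
        public final String coveredText;
        public final List<StandInConcept> concepts;

        public StandInAnnotation(String coveredText, List<StandInConcept> concepts) {
            this.coveredText = coveredText;
            this.concepts = new ArrayList<>(concepts);
        }
    }

    /** Step 2 of the pattern: read the codes held by each selected annotation. */
    public static Set<String> collectCuis(List<StandInAnnotation> annotations) {
        Set<String> cuis = new TreeSet<>(); // sorted, duplicates dropped
        for (StandInAnnotation annotation : annotations) {
            for (StandInConcept concept : annotation.concepts) {
                cuis.add(concept.cui);
            }
        }
        return cuis;
    }

    public static void main(String[] args) {
        // Step 1 of the pattern would be the select call; here the list is built by hand.
        // All CUIs and texts below are made-up placeholders, not real UMLS entries.
        List<StandInAnnotation> annotations = Arrays.asList(
            new StandInAnnotation("span one", Arrays.asList(
                new StandInConcept("C0000002", "placeholder term B"),
                new StandInConcept("C0000001", "placeholder term A"))),
            new StandInAnnotation("span two", Arrays.asList(
                new StandInConcept("C0000001", "placeholder term A"))));

        // Per-annotation view: covered text with each CUI and preferred text.
        for (StandInAnnotation annotation : annotations) {
            for (StandInConcept concept : annotation.concepts) {
                System.out.println(annotation.coveredText + " -> "
                    + concept.cui + " (" + concept.preferredText + ")");
            }
        }

        // Document-level view: the distinct CUIs across all annotations.
        System.out.println(collectCuis(annotations));
    }
}
```

The point of the sketch is only the direction of travel: the codes are read from annotations that are already stored, not from the internals of the annotator that produced them.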


If you are using the unreleased version 4.1 in trunk then you can use the IdentifiedAnnotationUtil to get information about a single IdentifiedAnnotation.  Methods concerning codes actually delegate to the OntologyConceptUtil.

https://svn.apache.org/repos/asf/ctakes/trunk/ctakes-core/src/main/java/org/apache/ctakes/core/util/annotation/IdentifiedAnnotationUtil.java



>System.out.println(allConceptsMap.keySet());
>System.out.println(allConceptsMap.getCollection(key));

Consider OntologyConceptUtil.getCodes( jcas, key )
http://ctakes.apache.org/apidocs/4.0.0/org/apache/ctakes/core/util/OntologyConceptUtil.html#getCodes-org.apache.uima.jcas.JCas-java.lang.String-

or OntologyConceptUtil.getSchemeCodes( jcas )
http://ctakes.apache.org/apidocs/4.0.0/org/apache/ctakes/core/util/OntologyConceptUtil.html#getSchemeCodes-org.apache.uima.jcas.JCas-

>Our main question is how to access this: [org.apache.ctakes.dictionary.lookup2.concept.DefaultConcept@6fe7b7eb] in order to obtain preferred terms and CUI codes?
One method would be OntologyConceptUtil.getCuis( jcas )
http://ctakes.apache.org/apidocs/4.0.0/org/apache/ctakes/core/util/OntologyConceptUtil.html#getCuis-org.apache.uima.jcas.JCas-
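
A side note on the printed value itself: output such as DefaultConcept@6fe7b7eb is what Java prints for any object whose class does not override toString(), namely the class name followed by a hash code in hex. It is an identity string, not something to parse; the data has to be read through the object's accessor methods. The sketch below shows this with a hypothetical stand-in class. StandInConcept, its getCui() and getPreferredText() getters, and the describe helper are assumptions made for illustration, not the real DefaultConcept API, and the CUI is a made-up placeholder.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

public class ConceptPrintDemo {

    /** Hypothetical stand-in for a dictionary concept; NOT the cTAKES DefaultConcept. */
    public static final class StandInConcept {
        private final String cui;
        private final String preferredText;

        public StandInConcept(String cui, String preferredText) {
            this.cui = cui;
            this.preferredText = preferredText;
        }

        public String getCui() {
            return cui;
        }

        public String getPreferredText() {
            return preferredText;
        }
        // No toString() override, so printing falls back to Object.toString().
    }

    /** Reads each concept through its getters instead of printing the object. */
    public static List<String> describe(Collection<StandInConcept> concepts) {
        List<String> lines = new ArrayList<>();
        for (StandInConcept concept : concepts) {
            lines.add(concept.getCui() + " | " + concept.getPreferredText());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Made-up placeholder values, not a real UMLS entry.
        Collection<StandInConcept> concepts = Collections.singletonList(
            new StandInConcept("C0000001", "placeholder preferred text"));

        // Prints something like [ConceptPrintDemo$StandInConcept@1b6d3586];
        // the hex part varies from run to run.
        System.out.println(concepts);

        // Prints the actual field values obtained through the getters.
        System.out.println(describe(concepts));
    }
}
```

To find out which accessors the real concept class offers, check its source or Javadoc rather than relying on the names used here.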

>One of our concerns is that we cannot figure out the types of the concepts and collections.

>We have also given the CAS a try, but didn’t quite understand it.


Here is my "easy button" recommendation.

You are going to create your own annotator.  Check this YouTube video: https://www.youtube.com/watch?v=NLEak_9VMbQ

There are adverts and emptiness in the beginning, then some boring basic info.  Skip to 8:30 and you will start to see information on how ctakes pipelines work.  After that there are a few examples of writing annotation engines for ctakes.  That is what you want to learn.  The speaker didn't do any prep run-through and it really shows, but the contents are decent.  It is long and boring but stick with it.


1.  Use the developer version of ctakes: version 4.1, trunk, whatever you want to call it.  It has the latest and greatest code.

2.  Make a copy of the piper file ctakes-clinical-pipeline-res/src/main/resources/org/apache/ctakes/clinical/pipeline/DefaultFastPipeline.piper

https://cwiki.apache.org/confluence/display/CTAKES/Piper+Files

3.  At the end of your copy, append the line "add SemanticTableFileWriter TableType=HTML"

4.  Run your new pipeline.

Halfway down the piper files wiki page there are 4 different methods for running piper files.  The easiest for a newbie might be the Piper File Submitter GUI.

https://cwiki.apache.org/confluence/display/CTAKES/Piper+File+Submitter+GUI


After you run your pipeline, you should have HTML files in your output directory.  These were created by the SemanticTableFileWriter that you added to your piper file.

Each HTML file contains a very simple table listing the identified concepts and information about them, including CUI and Preferred Text.


You can look at the code of SemanticTableFileWriter.java to see exactly how it is getting this information.

https://svn.apache.org/repos/asf/ctakes/trunk/ctakes-core/src/main/java/org/apache/ctakes/core/cc/SemanticTableFileWriter.java

Look at the code in getDataRows( jcas ) and AnnotationInfo( tui, section, annotation ).  Some lines in there probably do exactly what you want.

* You can also produce comma-separated, bar-separated, and tab-separated tables.



Sean







