Posted to dev@ctakes.apache.org by "Finan, Sean" <Se...@childrens.harvard.edu> on 2017/08/21 18:37:27 UTC
RE: Clinical documents section heading recognizer [EXTERNAL]
Hi Sekhar,
You can add the RegexSectionizer in ctakes-core to your pipeline, then use code such as:
final Collection<Segment> sections = JCasUtil.select( jCas, Segment.class );
for ( Segment section : sections ) {
   // Print the section name (id), then the text that the section covers.
   System.out.println( section.getId() );
   System.out.println( section.getCoveredText() );
}
The section names (id) come from a file in ctakes-core-res that lists names and regular expressions: org.apache.ctakes.core.sections.DefaultSectionRegex.bsv
It is an incomplete list; please add to it if you can.
Sean
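To make the idea of a name-plus-regex section file concrete, here is a minimal, self-contained sketch in plain Java. It is not the cTAKES implementation: the two-column "name||regex" line format, the class name SectionRegexSketch, and the rule that a section runs until the next matched header are all assumptions made for illustration, and the real DefaultSectionRegex.bsv may carry more columns.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; none of these names come from cTAKES.
final class SectionRegexSketch {

    /** A named span of the note, stored as [begin, end) character offsets. */
    static final class Section {
        final String id;
        final int begin;
        final int end;

        Section(String id, int begin, int end) {
            this.id = id;
            this.begin = begin;
            this.end = end;
        }

        String coveredText(String note) {
            return note.substring(begin, end).trim();
        }
    }

    /** Parses assumed "name||regex" lines into an ordered name -> pattern map. */
    static Map<String, Pattern> parseBsv(List<String> lines) {
        Map<String, Pattern> patterns = new LinkedHashMap<>();
        for (String line : lines) {
            String[] columns = line.split("\\|\\|", 2);
            if (columns.length < 2) {
                continue; // skip blank or malformed lines
            }
            patterns.put(columns[0].trim(),
                    Pattern.compile(columns[1].trim(), Pattern.MULTILINE | Pattern.CASE_INSENSITIVE));
        }
        return patterns;
    }

    /** Finds every header match; each section body runs from its header to the next header. */
    static List<Section> sectionize(String note, Map<String, Pattern> patterns) {
        List<Section> headers = new ArrayList<>();
        for (Map.Entry<String, Pattern> entry : patterns.entrySet()) {
            Matcher matcher = entry.getValue().matcher(note);
            while (matcher.find()) {
                headers.add(new Section(entry.getKey(), matcher.start(), matcher.end()));
            }
        }
        headers.sort(Comparator.comparingInt(h -> h.begin));
        List<Section> sections = new ArrayList<>();
        for (int i = 0; i < headers.size(); i++) {
            int bodyEnd = (i + 1 < headers.size()) ? headers.get(i + 1).begin : note.length();
            sections.add(new Section(headers.get(i).id, headers.get(i).end, bodyEnd));
        }
        return sections;
    }

    public static void main(String[] args) {
        Map<String, Pattern> patterns = parseBsv(Arrays.asList(
                "Vital Signs||^Vital Signs:",
                "Family History||^Family History:"));
        String note = "Vital Signs:\nBP 120/80\nFamily History:\nFather had BP 150/95\n";
        for (Section section : sectionize(note, patterns)) {
            System.out.println(section.id + " -> " + section.coveredText(note));
        }
    }
}
```

Adding a section in this sketch means adding one more "name||regex" line, which is roughly the kind of contribution the incomplete list above invites.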
-----Original Message-----
From: Liam Bui [mailto:lbui@phemi.com]
Sent: Monday, August 21, 2017 12:18 PM
To: dev@ctakes.apache.org
Cc: user@ctakes.apache.org
Subject: Re: Clinical documents section heading recognizer [EXTERNAL]
Hi Sekhar,
What you described seems to be related to Clinical Document Architecture
(CDA):
https://urldefense.proofpoint.com/v2/url?u=https-3A__apache.googlesource.com_ctakes_-2B_trunk_ctakes-2Dclinical-2Dpipeline_SystemArchitectureOverview.txt&d=DwIBaQ&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=zrvmFugHM6SD_d67Z5AlmrKS-XJnKoFH3sPcFUjgu8g&s=2tf5LOZU9ITY3oyo-A7FetDbdBy8YmM57ys78r1rto8&e=
I never managed to get CDA working in cTAKES, though.
On Sun, Aug 20, 2017 at 8:13 PM, Hari, Sekhar <se...@cgi.com> wrote:
> Hello there -
>
> With the latest version of cTAKES, is it possible to recognize and
> extract the clinical section headings in clinical documents? For
> example, my use case is like this:
>
> 1. Extract the BP readings from the 'Vital Signs' or 'Physical
> Examination' section. If BP reading(s) are mentioned in other places
> of the document too, ignore those readings and consider only the one
> in 'Vital Signs' or 'Physical Examination' section.
>
> 2. Ignore everything mentioned under 'Family History' section.
>
> I would be most grateful if you can share your thoughts / code snippet
> examples in cTAKES.
>
> Thanks,
> Sekhar H.
>
RE: Clinical documents section heading recognizer [EXTERNAL] [SUSPICIOUS]
Posted by "Finan, Sean" <Se...@childrens.harvard.edu>.
Edit:
It should be BsvRegexSectionizer, not RegexSectionizer.
To print annotations, skipping or filtering some of them by section:
final Map<Segment, Collection<IdentifiedAnnotation>> annotationSections
      = JCasUtil.indexCovered( jCas, Segment.class, IdentifiedAnnotation.class );
for ( Map.Entry<Segment, Collection<IdentifiedAnnotation>> entry : annotationSections.entrySet() ) {
   final String sectionName = entry.getKey().getPreferredText();
   // Skip everything in the family history section.
   if ( "Family Medical History".equals( sectionName ) ) {
      continue;
   }
   // Keep "BP" only when it occurs in Vital Signs or Physical Examination.
   entry.getValue().stream()
        .filter( a -> !a.getCoveredText().equals( "BP" )
                      || "Vital Signs".equals( sectionName )
                      || "Physical Examination".equals( sectionName ) )
        .forEach( System.out::println );
}
Or something like that.
Sean
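The two rules from the original question (ignore Family History entirely; accept BP readings only from Vital Signs or Physical Examination) can also be tried out without a cTAKES pipeline. The sketch below is a plain-Java stand-in, not cTAKES code: the Mention class, the section names, and the "starts with BP" check are hypothetical placeholders for whatever annotations and section ids a real pipeline produces.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; none of these names come from cTAKES.
final class SectionFilterSketch {

    /** Stand-in for an annotation together with the name of the section covering it. */
    static final class Mention {
        final String section;
        final String text;

        Mention(String section, String text) {
            this.section = section;
            this.text = text;
        }
    }

    // Sections whose content is dropped entirely (assumed name).
    static final Set<String> IGNORED_SECTIONS =
            new HashSet<>(Arrays.asList("Family History"));

    // The only sections from which BP readings are accepted (assumed names).
    static final Set<String> BP_SECTIONS =
            new HashSet<>(Arrays.asList("Vital Signs", "Physical Examination"));

    /** Crude placeholder for recognizing a blood pressure reading. */
    static boolean isBloodPressure(String text) {
        return text.startsWith("BP");
    }

    /** Applies both rules and returns "section: text" for each mention that survives. */
    static List<String> keep(List<Mention> mentions) {
        List<String> kept = new ArrayList<>();
        for (Mention mention : mentions) {
            if (IGNORED_SECTIONS.contains(mention.section)) {
                continue; // rule 2: ignore everything under Family History
            }
            if (isBloodPressure(mention.text) && !BP_SECTIONS.contains(mention.section)) {
                continue; // rule 1: BP readings count only in the allowed sections
            }
            kept.add(mention.section + ": " + mention.text);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Mention> mentions = Arrays.asList(
                new Mention("Vital Signs", "BP 120/80"),
                new Mention("History of Present Illness", "BP 150/95"),
                new Mention("History of Present Illness", "headache"),
                new Mention("Family History", "diabetes"));
        keep(mentions).forEach(System.out::println);
    }
}
```

With the sample mentions in main, only the Vital Signs reading and the non-BP "headache" mention are kept. In a real pipeline the section name would come from the covering Segment rather than from a field on the mention.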