Posted to dev@ctakes.apache.org by "Finan, Sean" <Se...@childrens.harvard.edu> on 2017/08/21 18:37:27 UTC

RE: Clinical documents section heading recognizer [EXTERNAL]

Hi Sekhar, 

You can add the RegexSectionizer in ctakes-core to your pipeline, then use code such as:

      final Collection<Segment> sections = JCasUtil.select( jCas, Segment.class );
      for ( Segment section : sections ) {
         System.out.println( section.getId() );
         System.out.println( section.getCoveredText() );
      }

The section names (ids) come from a file in ctakes-core-res that lists names and regular expressions: org.apache.ctakes.core.sections.DefaultSectionRegex.bsv
It is an incomplete list; please add to it if you can.
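
To illustrate the general idea of splitting a note with a list of names and header regular expressions, here is a minimal sketch in plain Java. It does not use cTAKES and does not read DefaultSectionRegex.bsv; the class SectionSplitSketch, the two header patterns, and the sample note are all assumptions made up for this example, and the real sectionizer may behave differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-alone sketch; not the cTAKES sectionizer.
class SectionSplitSketch {

   // Assumed name -> header regex pairs, standing in for entries of a section regex list.
   private static final Map<String, Pattern> HEADERS = new LinkedHashMap<>();
   static {
      final int flags = Pattern.MULTILINE | Pattern.CASE_INSENSITIVE;
      HEADERS.put( "Vital Signs", Pattern.compile( "^[ \\t]*vital signs[ \\t]*:?[ \\t]*$", flags ) );
      HEADERS.put( "Family History", Pattern.compile( "^[ \\t]*family history[ \\t]*:?[ \\t]*$", flags ) );
   }

   // A matched header: its section name and the offset where the header text ends.
   private static final class Header {
      final String name;
      final int end;
      Header( final String name, final int end ) {
         this.name = name;
         this.end = end;
      }
   }

   // Returns section name -> section text, in document order.
   static Map<String, String> split( final String note ) {
      // Header start offset -> header, kept sorted by offset.
      final TreeMap<Integer, Header> found = new TreeMap<>();
      for ( Map.Entry<String, Pattern> entry : HEADERS.entrySet() ) {
         final Matcher matcher = entry.getValue().matcher( note );
         while ( matcher.find() ) {
            found.put( matcher.start(), new Header( entry.getKey(), matcher.end() ) );
         }
      }
      final Map<String, String> sections = new LinkedHashMap<>();
      Integer start = found.isEmpty() ? null : found.firstKey();
      while ( start != null ) {
         // A section runs from the end of its header to the start of the next header.
         final Integer next = found.higherKey( start );
         final int textEnd = next == null ? note.length() : next;
         final Header header = found.get( start );
         sections.put( header.name, note.substring( header.end, textEnd ).trim() );
         start = next;
      }
      return sections;
   }

   public static void main( final String[] args ) {
      final String note = "VITAL SIGNS:\nBP 120/80\nFAMILY HISTORY:\nFather had BP 150/95\n";
      System.out.println( split( note ) );
   }
}
```

In this sketch a repeated header name overwrites the earlier section text, and any text before the first header is dropped; both are simplifications.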

Sean

-----Original Message-----
From: Liam Bui [mailto:lbui@phemi.com] 
Sent: Monday, August 21, 2017 12:18 PM
To: dev@ctakes.apache.org
Cc: user@ctakes.apache.org
Subject: Re: Clinical documents section heading recognizer [EXTERNAL]

Hi Sekhar,

What you described seems to be related to Clinical Document Architecture
(CDA):
https://apache.googlesource.com/ctakes/+/trunk/ctakes-clinical-pipeline/SystemArchitectureOverview.txt

I never managed to get CDA working in cTAKES, though.

On Sun, Aug 20, 2017 at 8:13 PM, Hari, Sekhar <se...@cgi.com> wrote:

> Hello there -
>
> With the latest version of cTAKES, is it possible to recognize and 
> extract the clinical section headings in clinical documents? For 
> example, my use case is like this:
>
> 1. Extract the BP readings from the 'Vital Signs' or 'Physical 
> Examination' section. If BP reading(s) are mentioned in other places 
> of the document too, ignore those readings and consider only the one 
> in the 'Vital Signs' or 'Physical Examination' section.
>
> 2. Ignore everything mentioned under the 'Family History' section.
>
> I would be most grateful if you can share your thoughts / code snippet 
> examples in cTAKES.
>
> Thanks,
> Sekhar H.
>

RE: Clinical documents section heading recognizer [EXTERNAL] [SUSPICIOUS]

Posted by "Finan, Sean" <Se...@childrens.harvard.edu>.

Edit:

It should be BsvRegexSectionizer, not RegexSectionizer.

To print annotations, while ignoring some of them depending on their section:

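import java.util.Collection;
import java.util.Map;
import org.apache.ctakes.typesystem.type.textsem.IdentifiedAnnotation;
import org.apache.ctakes.typesystem.type.textspan.Segment;
import org.apache.uima.fit.util.JCasUtil;

// Index the annotations covered by each section.
Map<Segment,Collection<IdentifiedAnnotation>> annotationSections = JCasUtil.indexCovered( jCas, Segment.class, IdentifiedAnnotation.class );

for ( Map.Entry<Segment,Collection<IdentifiedAnnotation>> entry : annotationSections.entrySet() ) {
   String sectionName = entry.getKey().getPreferredText();
   // Skip everything in the family history section.
   if ( sectionName.equals( "Family Medical History" ) ) {
      continue;
   }
   // Keep "BP" mentions only when they occur in the wanted sections.
   entry.getValue().stream()
      .filter( a -> !a.getCoveredText().equals( "BP" ) || sectionName.equals( "Vital Signs" ) || sectionName.equals( "Physical Examination" ) )
      .forEach( System.out::println );
}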

Or something like that.
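
The same filtering idea can be tried without a cTAKES pipeline. The sketch below is plain Java and assumes the sections are already available as a map of section name to text; the class BpBySectionSketch, the "BP 120/80" reading format, and the sample data in main are hypothetical and only meant to show the keep-only-wanted-sections logic from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-alone sketch; it does not use cTAKES types.
class BpBySectionSketch {

   // Sections whose BP readings should be kept (names taken from the question).
   private static final Set<String> WANTED
         = new HashSet<>( Arrays.asList( "Vital Signs", "Physical Examination" ) );

   // Assumed reading format for illustration only, e.g. "BP 120/80" or "BP: 120/80".
   private static final Pattern BP = Pattern.compile( "\\bBP\\s*:?\\s*(\\d{2,3}/\\d{2,3})" );

   // Collects readings from wanted sections; all other sections are skipped entirely.
   static List<String> readings( final Map<String, String> sections ) {
      final List<String> kept = new ArrayList<>();
      for ( Map.Entry<String, String> section : sections.entrySet() ) {
         if ( !WANTED.contains( section.getKey() ) ) {
            continue;
         }
         final Matcher matcher = BP.matcher( section.getValue() );
         while ( matcher.find() ) {
            kept.add( matcher.group( 1 ) );
         }
      }
      return kept;
   }

   public static void main( final String[] args ) {
      final Map<String, String> sections = new LinkedHashMap<>();
      sections.put( "Vital Signs", "BP 120/80, pulse 72" );
      sections.put( "Family History", "Father had BP 150/95" );
      sections.put( "Physical Examination", "BP: 118/76 on recheck" );
      System.out.println( readings( sections ) );
   }
}
```

Using an allow-list of section names means 'Family History' never needs to be excluded explicitly; it is ignored because it is not one of the wanted sections.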

Sean

