Posted to dev@ctakes.apache.org by "Finan, Sean" <Se...@childrens.harvard.edu> on 2018/02/27 13:21:52 UTC

RE: Trying to Understand cTAKES [EXTERNAL]

Hi Don,

The default clinical pipeline will provide a little more information:

https://cwiki.apache.org/confluence/display/CTAKES/Default+Clinical+Pipeline

Sean


-----Original Message-----
From: Don Flinn [mailto:flinn@alum.mit.edu] 
Sent: Tuesday, February 27, 2018 4:16 AM
To: dev@ctakes.apache.org
Subject: Trying to Understand cTAKES [EXTERNAL]

Hi,
I'm new to cTAKES and am trying to understand the product.  One of my goals is to read in medical research documents in a given medical domain, glean semantic information from them, and put that information into a database that I can query.  I have run through the cTAKES examples, and they seem to go only as far as parts of speech (POS).  Poking around, I found ClinicalPipelineFactory.java, which computes Subject.  Are there other examples that go further into the semantics?

Thanks for any help
Don

Re: Trying to Understand cTAKES [EXTERNAL]

Posted by andy mcmurry <mc...@gmail.com>.

Don,
You may want to look at Semantic MEDLINE and SemRep from NLM. In general,
cTAKES is better suited to physician notes, whereas MetaMap may be better on
literature. For extracting predicate relationships, you can even download a
huge database of preprocessed papers.




RE: Trying to Understand cTAKES [EXTERNAL]

Posted by "Finan, Sean" <Se...@childrens.harvard.edu>.

Hi Don,

Did you install that dictionary?
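
One way to check this before re-running the pipeline is to look on disk for the descriptor named in the log. The sketch below is only an illustration and is not part of cTAKES: the helper name find_descriptor and the install path /opt/apache-ctakes are hypothetical, and it assumes the dictionary is unpacked as ordinary files somewhere under the install directory (it will not see a descriptor packaged inside a jar).

```python
from pathlib import Path

# Relative descriptor path copied from the log output in this thread.
DESCRIPTOR = "org/apache/ctakes/dictionary/lookup/fast/sno_rx_16ab.xml"


def find_descriptor(install_dir, relative_path=DESCRIPTOR):
    """Return every file under install_dir whose path ends with relative_path."""
    root = Path(install_dir)
    parts = Path(relative_path).parts
    matches = []
    for candidate in root.rglob(parts[-1]):
        # Compare the trailing path components so that only the full
        # package-style path counts, not any file with the same name.
        if candidate.is_file() and candidate.parts[-len(parts):] == parts:
            matches.append(candidate)
    return sorted(matches)


if __name__ == "__main__":
    # Hypothetical install location; substitute your own.
    hits = find_descriptor("/opt/apache-ctakes")
    if hits:
        for hit in hits:
            print("found:", hit)
    else:
        print("descriptor not found; the dictionary may not be installed")
```

An empty result does not prove the dictionary is missing, only that it is not present as a plain file under the directory searched.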

-----Original Message-----
From: Don Flinn [mailto:flinn@alum.mit.edu] 
Sent: Tuesday, February 27, 2018 1:43 PM
To: dev@ctakes.apache.org
Subject: Re: Trying to Understand cTAKES [EXTERNAL]

Hi Sean,

I ran the batch file and got the following error, so I have something basically wrong, but don't know what.  Any help appreciated - Don

My input:
 bin/runClinicalPipeline  -i /tmp/files/CtakesInput  --xmiOut /tmp/files/CtakesOutput  --user <my userName> --pass <myPass>

My username and password are correct, as I have used them in a number of the cTAKES examples and they were verified.  The xmiOut directory contains three short text files, but I don't think the script even got that far.  I don't know what the error message means or how to correct it.

The output:
27 Feb 2018 13:01:01  INFO SentenceDetector - Sentence detector model file: org/apache/ctakes/core/sentdetect/sd-med-model.zip
27 Feb 2018 13:01:01  INFO TokenizerAnnotatorPTB - Initializing org.apache.ctakes.core.ae.TokenizerAnnotatorPTB
27 Feb 2018 13:01:01  INFO ContextDependentTokenizerAnnotator - Finite state machines loaded.
27 Feb 2018 13:01:01  INFO POSTagger - POS tagger model file: org/apache/ctakes/postagger/models/mayo-pos.zip
27 Feb 2018 13:01:01  INFO Chunker - Chunker model file: org/apache/ctakes/chunker/models/chunker-model.zip
27 Feb 2018 13:01:02  INFO AbstractJCasTermAnnotator - Using dictionary lookup window type: org.apache.ctakes.typesystem.type.textspan.Sentence
27 Feb 2018 13:01:02  INFO AbstractJCasTermAnnotator - Exclusion tagset loaded: CC CD DT EX IN LS MD PDT POS PP PP$ PRP PRP$ RP TO VB VBD VBG VBN VBP VBZ WDT WP WPS WRB
27 Feb 2018 13:01:02  INFO AbstractJCasTermAnnotator - Using minimum term text span: 3
27 Feb 2018 13:01:02  INFO AbstractJCasTermAnnotator - Using Dictionary Descriptor: org/apache/ctakes/dictionary/lookup/fast/sno_rx_16ab.xml
27 Feb 2018 13:01:02 ERROR PiperFileRunner - Initialization of annotator class "org.apache.ctakes.dictionary.lookup2.ae.DefaultJCasTermAnnotator" failed.  (Descriptor: <unknown>)


Re: Trying to Understand cTAKES [EXTERNAL]

Posted by Don Flinn <fl...@alum.mit.edu>.

Hi Sean,

I ran the batch file and got the following error, so I have something
basically wrong, but don't know what.  Any help appreciated - Don

My input:
 bin/runClinicalPipeline  -i /tmp/files/CtakesInput  --xmiOut /tmp/files/CtakesOutput  --user <my userName> --pass <myPass>

My username and password are correct, as I have used them in a number of the
cTAKES examples and they were verified.  The xmiOut directory contains
three short text files, but I don't think the script even got that far.  I
don't know what the error message means or how to correct it.

The output:
27 Feb 2018 13:01:01  INFO SentenceDetector - Sentence detector model file: org/apache/ctakes/core/sentdetect/sd-med-model.zip
27 Feb 2018 13:01:01  INFO TokenizerAnnotatorPTB - Initializing org.apache.ctakes.core.ae.TokenizerAnnotatorPTB
27 Feb 2018 13:01:01  INFO ContextDependentTokenizerAnnotator - Finite state machines loaded.
27 Feb 2018 13:01:01  INFO POSTagger - POS tagger model file: org/apache/ctakes/postagger/models/mayo-pos.zip
27 Feb 2018 13:01:01  INFO Chunker - Chunker model file: org/apache/ctakes/chunker/models/chunker-model.zip
27 Feb 2018 13:01:02  INFO AbstractJCasTermAnnotator - Using dictionary lookup window type: org.apache.ctakes.typesystem.type.textspan.Sentence
27 Feb 2018 13:01:02  INFO AbstractJCasTermAnnotator - Exclusion tagset loaded: CC CD DT EX IN LS MD PDT POS PP PP$ PRP PRP$ RP TO VB VBD VBG VBN VBP VBZ WDT WP WPS WRB
27 Feb 2018 13:01:02  INFO AbstractJCasTermAnnotator - Using minimum term text span: 3
27 Feb 2018 13:01:02  INFO AbstractJCasTermAnnotator - Using Dictionary Descriptor: org/apache/ctakes/dictionary/lookup/fast/sno_rx_16ab.xml
27 Feb 2018 13:01:02 ERROR PiperFileRunner - Initialization of annotator class "org.apache.ctakes.dictionary.lookup2.ae.DefaultJCasTermAnnotator" failed.  (Descriptor: <unknown>)
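
To pull the failure out of a longer run, the console output can be scanned for ERROR entries. This is a minimal sketch, assuming the line layout seen in the output above (date, time, level, component, " - ", message); it folds any wrapped continuation line into the entry before it. The names parse_log and errors are hypothetical helpers, not cTAKES functions.

```python
import re

# One log entry: "27 Feb 2018 13:01:02  INFO Component - message"
ENTRY = re.compile(
    r"^(?P<stamp>\d{1,2} \w{3} \d{4} \d{2}:\d{2}:\d{2})\s+"
    r"(?P<level>[A-Z]+)\s+(?P<component>\S+) - (?P<message>.*)$"
)


def parse_log(text):
    """Split console output into entries, joining wrapped continuation lines."""
    entries = []
    for line in text.splitlines():
        match = ENTRY.match(line)
        if match:
            entries.append(match.groupdict())
        elif entries and line.strip():
            # Not a new entry: treat it as the wrapped tail of the previous one.
            entries[-1]["message"] += " " + line.strip()
    return entries


def errors(entries):
    """Keep only the entries logged at ERROR level."""
    return [entry for entry in entries if entry["level"] == "ERROR"]


if __name__ == "__main__":
    sample = (
        "27 Feb 2018 13:01:02  INFO AbstractJCasTermAnnotator - Using Dictionary\n"
        "Descriptor: org/apache/ctakes/dictionary/lookup/fast/sno_rx_16ab.xml\n"
        "27 Feb 2018 13:01:02 ERROR PiperFileRunner - Initialization of annotator\n"
        'class "org.apache.ctakes.dictionary.lookup2.ae.DefaultJCasTermAnnotator"\n'
        "failed.  (Descriptor: <unknown>)\n"
    )
    for entry in errors(parse_log(sample)):
        print(entry["stamp"], entry["component"], entry["message"])
```

Reading the INFO entry just before the first ERROR is often the quickest way to see which resource was being loaded when initialization failed.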
