Posted to dev@ctakes.apache.org by "Miller, Timothy" <Ti...@childrens.harvard.edu> on 2017/09/29 19:14:34 UTC

Re: Enabling drugner pipeline and identifying dates [EXTERNAL] [SUSPICIOUS] [SUSPICIOUS]

It is a very busy time for me but this is on my todo list. Don't be
afraid to ping in a week or so if you don't hear anything.

Tim

On Fri, 2017-09-29 at 14:04 +0000, Finan, Sean wrote:
> Hi Gandhi,
> > 
> > Did you mean that with the text I sent, the co-reference
> > superscript-1 will be lost?
> Yes.  Well, to be more clear, the coreference that was resolved as #1
> in your original sentence alone will be lost.  However, there are
> eight or nine coreference chains discovered in your full paragraph,
> and one of those will have superscript 1s.
> 
> > 
> > Could someone have a look and let us know your thoughts, please?
> Thank you for creating the jira and the patch.  I am sure that
> somebody will take a look.
> 
> Thanks,
> Sean
> 
> 
> -----Original Message-----
> From: Gandhi Rajan Natarajan [mailto:Gandhi.Natarajan@arisglobal.com]
>  
> Sent: Friday, September 29, 2017 2:25 AM
> To: dev@ctakes.apache.org
> Subject: RE: Enabling drugner pipeline and identifying dates
> [EXTERNAL] [SUSPICIOUS]
> 
> Hi Sean,
> 
> Thanks again for the response. I guess it was a mistake on my side that
> I didn't send the complete text. Did you mean that with the text I
> sent, the co-reference superscript-1 will be lost?
> 
> Also, as per your advice, we have created an issue,
> https://issues.apache.org/jira/browse/CTAKES-459, for the
> measurement FSM changes and attached the modified files. Could
> someone have a look and let us know your thoughts, please?
> 
> Regards,
> Gandhi
> 
> 
> -----Original Message-----
> From: Finan, Sean [mailto:Sean.Finan@childrens.harvard.edu]
> Sent: Thursday, September 28, 2017 8:21 PM
> To: dev@ctakes.apache.org
> Cc: Miller, Timothy <Ti...@childrens.harvard.edu>
> Subject: RE: Enabling drugner pipeline and identifying dates
> [EXTERNAL] [SUSPICIOUS]
> 
> Hi Gandhi,
> 
> I don't recall you sending me that entire snippet of text.  I think
> that I only had your single example sentence.
> You have discovered one of the quirks of software: "change the data,
> change the result."
> cTAKES is a system with many moving parts.  Things that precede or
> follow your original example sentence will change the evaluation of
> that sentence.
> With the pipeline you are using and the full note, you should see a
> number (mine is 4) next to the first "thalomid" in the original
> example sentence.  If you click that number you should see (to the
> right) 4 instances of "thalomid".
> Tim can correct me here, but maybe the coreference module ranked the
> links between "thalomid" as much higher than the rank between "study
> treatment of thalomid 200mg" and "the treatment of hepatocellular
> carcinoma" and discarded the encapsulating treatment texts from
> markables?  It is probably more complex than that.
> 
> > 
> > we have also made some code changes in MeasurementFSM.java to
> > identify certain measurements like '20 mg/m2' which were not
> > identified out of the box.  Should we send the code changes to you
> > so that you can consider the same to be productized? Please
> > advise.
> I don't know if you've noticed the recent emails on the dev list
> involving Alexandru Zbarcea.  Alex has been creating or commenting on
> Jira items and attaching code for  fixes and enhancements.  This is a
> widely used process and is fairly easy to follow.   I think that the
> following links are relevant:
> Working with issues:  https://confluence.atlassian.com/jiracoreserver073/working-with-issues-861257307.html
> Creating patches:   https://confluence.atlassian.com/crucible/creating-patch-files-for-pre-commit-reviews-298977458.html
> Attaching files:   https://confluence.atlassian.com/jiracorecloud/attaching-files-and-screenshots-to-issues-765593805.html
> 
> I don't know if you have a jira account and permissions for the
> ctakes project.  An administrator may need to set that up for you.
> 
> Thanks,
> Sean
> 
> -----Original Message-----
> From: Gandhi Rajan Natarajan [mailto:Gandhi.Natarajan@arisglobal.com]
> Sent: Thursday, September 28, 2017 4:09 AM
> To: dev@ctakes.apache.org
> Subject: RE: Enabling drugner pipeline and identifying dates
> [EXTERNAL] [SUSPICIOUS]
> 
> Hi Sean,
> 
> Thanks for the response. I was able to see the co-reference
> superscript using the html file that you sent. Interestingly, I
> was also able to generate the sample HTML using the piper GUI with only
> that single line - " The patient started study treatment of Thalomid
> 200mg (days 1-21), and Epirubicin, 20 mg/m2 (days 1, 8, and 15) on
> 06/07/02 for the treatment of hepatocellular carcinoma. " in the
> input file.
> 
> But when I change the input file content with the following lines:
> 
> "This patient is participating in a Non-IND study; Protocol CG-
> 000424: "Phase I/II of Thalidomide and Epirubicin in Patients with
> Unresectable or Metastatic Hepatocellular Carcinoma".Information has
> been received from the investigator regarding an 82 year-old male
> patient who had gastrointestinal bleeding while on Thalomid,
> Epirubicin, and Coumadin. He had a past medical history of
> diverticulosis in 03/02 and a right atrial clot from intraventricular
> catheter (IVC) for which he was started on Coumadin. During the
> hospitalization for a right atrial clot in 03/02 hepatocellular
> carcinoma was first noted and he was referred to an oncologist.  The
> patient started study treatment of Thalomid 200mg (days 1-21), and
> Epirubicin, 20 mg/m2 (days 1, 8, and 15) on 06/07/02 for the
> treatment of hepatocellular carcinoma.  He was concomitantly
> receiving Cardura, Ambien (for insomnia), Megace, Coumadin, and
> Oxycodone. This patient presented to the emergency room with the
> chief complaint of hematochezia. He reported noticing bright red
> blood and small clots mixed in with his stool. On 07/13/02, he was
> admitted due to gastrointestinal bleed.  The physician ordered 2
> large bore intravenous lines and planned to transfuse for hematocrit
> less than 30%. Due to the  INR (international normalized ratio) level
> of 3.0, Coumadin was held. He was also noted to have bilateral lower
> extremity edema with dyspnea on exertion.  On 07/13/02, he had a
> chest X-ray PA and lateral done that showed no evidence of acute
> pneumonia or congestive heart failure.  On 07/14/02, he underwent  an
> ultrasound which was negative for deep vein thrombosis. This patient
> did not take Thalomid on the day of his admittance to the hospital,
> but resumed treatment shortly after with no return of symptoms. On
> 07/15/02, he was discharged in stable condition. There have been no
> further reports of bleeding at this time. Thedoctor has assessed the
> hematochezia as related to Coumadin treatment and previously
> diagnosed diverticulosis, and not to protocol therapy with Thalomid
> and Epirubicin.Additional information received from the investigator
> on 27Aug02 reveals that this male patient began on 07Jun02 two cycles
> of therapy with Thalidomide and Epirubicin.  His post cycle two
> computed tomography scans revealed increase in size of liver lesion
> with development of multiple new satellite nodules.  On 29Jul02, the
> investigator removed this patient from protocol for progressive
> disease and recommended hospice care.  After seeking a second opinion
> from two other institutions, this patient was admitted to hospice on
> 05Aug02.  On 20Aug02, the investigator noted that this patient was
> suffering worsening fatigue and got tired getting out of his
> chair.  On 25Aug02, this patient died due to disease
> progression.  The investigator assessed the death as not related to
> study treatment and expected"
> 
> The co-reference superscript is then lost. Did you try the
> complete text above in your piper GUI by any chance? Also, I guess you
> did not notice the question in my last post - " Sean, we have also
> made some code changes in MeasurementFSM.java to identify certain
> measurements like '20 mg/m2' which were not identified out of the
> box.  Should we send the code changes to you so that you can consider
> the same to be productized? Please advise."
> 
> 
> Regards,
> Gandhi
> 
> 
> -----Original Message-----
> From: Finan, Sean [mailto:Sean.Finan@childrens.harvard.edu]
> Sent: Wednesday, September 27, 2017 5:53 PM
> To: dev@ctakes.apache.org
> Subject: RE: Enabling drugner pipeline and identifying dates
> [EXTERNAL] [SUSPICIOUS]
> 
> Hi Gandhi,
> 
> I am glad that you are feeling better.
> I don't understand why you aren't getting the same output as me.  I
> just ran your example sentence with your piper with a fresh checkout
> and get the html below.  The css follows.  Copy and paste into a file
> and see if you see the corefs.
> 
> /////////////////////////////////////////////////////  html, copy
> into file  /////////////////////////////////////////////////
> 
> <!DOCTYPE html>
> <html>
> <head>
>   <title>OneLiner Output</title>
> </head>
> <body>
> <link rel="stylesheet" href="ctakes.pretty.css" type="text/css"
> media="screen"> <h2>OneLiner</h2>  <i>Text processing finished on: 9
> 27 2017, 08:15:31</i> <hr>
> 
> <div id="content">
> 
> <p>
> The patient <span class="AFF_"
> onClick="iaf('AFF_NL_EVTNL_startedNL_SPC_[before] doc timeNL_NL_')"
> TIP="Event ">started</span> study <span class="AFF_"
> onClick="iaf('AFF_NL_EVTNL_treatmentNL_SPC_[before] doc
> timeNL_NL_PRCNL_treatmentNL_SPC_C0087111NL_SPC_[Therapeutic
> procedure]NL_SPC_[before] doc timeNL_NL_')" TIP="Event Procedure
> ">treatment</span><span class="PRC"><sup>&bull;</sup></span> of <span
> class="AFF_"
> onClick="iaf('AFF_NL_DRGNL_ThalomidNL_SPC_C0723668NL_SPC_[before] doc
> timeNL_NL_')" TIP="Drug ">Thalomid</span><span
> class="DRG"><sup>&bull;</sup></span> <span class="AFF_"
> onClick="iaf('AFF_NL_EVTNL_200mgNL_SPC_[before] doc timeNL_NL_')"
> TIP="Event ">200mg</span><span class="UNK"
> onClick="crf1()"><sup>1</sup></span> ( <span class="GNR_"
> onClick="iaf('GNR_NL_TMXNL_daysNL_NL_')" TIP="Time ">days</span> 1 -
> 21 ) , and <span class="AFF_"
> onClick="iaf('AFF_NL_DRGNL_EpirubicinNL_SPC_C0014582NL_SPC_[before]
> doc timeNL_NL_')" TIP="Drug ">Epirubicin</span><span
> class="DRG"><sup>&bull;</sup></span> , 20 mg / m2 ( <span
> class="GNR_" onClick="iaf('GNR_NL_TMXNL_days 1 , 8NL_NL_')" TIP="Time
> ">days 1 , 8</span> , and 15 ) on <span class="GNR_"
> onClick="iaf('GNR_NL_TMXNL_06 / 07 / 02NL_SPC_[CONTAINS]
> treatmentNL_NL_')" TIP="Time ">06 / 07 / 02</span> for the <span
> class="AFF_" onClick="iaf('AFF_NL_EVTNL_treatmentNL_SPC_[before] doc
> timeNL_SPC_06 / 07 / 02
> [CONTAINS]NL_NL_PRCNL_treatmentNL_SPC_C0087111NL_SPC_[Therapeutic
> procedure]NL_SPC_[before] doc timeNL_NL_')" TIP="Event Procedure
> ">treatment</span><span class="PRC"><sup>&bull;</sup></span> of <span
> class="AFF_" onClick="iaf('AFF_NL_DISNL_hepatocellular
> carcinomaNL_SPC_C2239176NL_SPC_[Liver carcinoma]NL_SPC_[before] doc
> timeNL_NL_')" TIP="Disorder ">hepatocellular </span><span
> class="AFF_" onClick="iaf('AFF_NL_DISNL_hepatocellular
> carcinomaNL_SPC_C2239176NL_SPC_[Liver carcinoma]NL_SPC_[before] doc
> timeNL_NL_EVTNL_carcinomaNL_SPC_[before] doc timeNL_NL_')"
> TIP="Disorder Event ">carcinoma</span><span class="DIS"
> onClick="crf1()"><sup>1</sup></span> .
> <br>
> 
> </p>
> 
> </div>
> 
> <div id="ia"> Annotation Information </div> <script
> type="text/javascript">
>   function iaf(txt) {
>     var aff=txt.replace( /AFF_/g,"<br><h3>Affirmed</h3>" );
>     var neg=aff.replace( /NEG_/g,"<br><h3>Negated</h3>" );
>     var unc=neg.replace( /UNC_/g,"<br><h3>Uncertain</h3>" );
>     var unn=unc.replace( /UNN_/g,"<br><h3>Uncertain, Negated</h3>" );
>     var ant=unn.replace( /ANT/g,"<b>Anatomical Site</b>" );
>     var dis=ant.replace( /DIS/g,"<b>Disease/ Disorder</b>" );
>     var fnd=dis.replace( /FND/g,"<b>Sign/ Symptom</b>" );
>     var prc=fnd.replace( /PRC/g,"<b>Procedure</b>" );
>     var drg=prc.replace( /DRG/g,"<b>Medication</b>" );
>     var evt=drg.replace( /EVT/g,"<b>Event</b>" );
>     var tmx=evt.replace( /TMX/g,"<b>Time</b>" );
>     var unk=tmx.replace( /UNK/g,"<b>Unknown</b>" );
>     var spc=unk.replace(
> /SPC_/g,"&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;" );
>     var prf1=spc.replace( /\[/g,"<i>" );
>     var prf2=prf1.replace( /\]/g,"</i>" );
>     var nl=prf2.replace( /NL_/g,"<br>" );
>     document.getElementById("ia").innerHTML = nl;
>   }
>   function crf1() {
>     document.getElementById("ia").innerHTML = "<br><h3>Coreference Chain</h3>study treatment of Thalomid 200mg<br>the treatment of hepatocellular carcinoma";
>   }
> </script></body>
> </html>
> 
> 
> 
> /////////////////////////////////////////////////////  css, copy into
> file named ctakes.pretty.css in same directory as
> html   /////////////////////////////////////////////////
> 
> 
> 
> .GNR_ {
>   position: relative;
>   display: inline-block gray;
>   border-bottom: 0.10em solid gray;
> }
> 
> .AFF_ {
>   position: relative;
>   display: inline-block green;
>   border-bottom: 0.15em solid green;
> }
> 
> .UNC_ {
>   position: relative;
>   display: inline-block gold;
>   border-bottom: 0.16em dotted gold;
> }
> 
> .NEG_ {
>   position: relative;
>   display: inline-block red;
>   border-bottom: 0.16em dashed red;
> }
> 
> .UNN_ {
>   position: relative;
>   display: inline-block orange;
>   border-bottom: 0.16em dashed orange;
> }
> 
> .FND {
>   color: magenta;
> }
> 
> .DIS {
>   color: black;
> }
> 
> .DRG {
>   color: red;
> }
> 
> .PRC {
>   color: blue;
> }
> 
> .ANT {
>   color: gray;
> }
> 
> .UNK {
>   color: gray;
> }
> 
> [TIP] {
>   position: relative;
>   z-index: 2;
>   cursor: pointer;
> }
> [TIP]::before,
> [TIP]::after {
>   visibility: hidden;
>   -ms-filter: "progid:DXImageTransform.Microsoft.Alpha(Opacity=0)";
>   filter: progid: DXImageTransform.Microsoft.Alpha(Opacity=0);
>   opacity: 0;
>   pointer-events: none;
> }
> [TIP]::before {
>   position: absolute;
>   bottom: 0%;
>   left: 100%;
>   margin-bottom: 5px;
>   padding: 7px;
>   -webkit-border-radius: 3px;
>   -moz-border-radius: 3px;
>   border-radius: 3px;
>   background-color: #000;
>   background-color: hsla(0, 0%, 20%, 0.9);
>   color: #fff;
>   content: attr(TIP);
>   text-align: center;
>   font-size: 14px;
>   line-height: 1.2;
> }
> [TIP]:hover::before,
> [TIP]:hover::after {
>   visibility: visible;
>   -ms-filter: "progid:DXImageTransform.Microsoft.Alpha(Opacity=100)";
>   filter: progid: DXImageTransform.Microsoft.Alpha(Opacity=100);
>   opacity: 1;
> }
> 
> div#ia {
>   position: fixed;
>   top: 0;
>   right: 0;
>   width: 20%;
>   height: 100%;
>   padding: 10px;
>   overflow: auto;
>   background-color: lightgray;
> }
> 
> div#content {
>   width: 79%;
>   height: 100%;
>   padding: 10px;
>   overflow: auto;
> }
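The onClick payloads in the HTML above (for example 'AFF_NL_DRGNL_ThalomidNL_...') are compact codes that the iaf() function expands into readable HTML through a chain of replacements. A minimal Java sketch of the same expansion follows, for anyone who wants to decode those payloads outside a browser. The class and method names are hypothetical; only the code-to-text mapping and its order are taken from the iaf() function in the pasted page.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: expand the compact codes in the onClick payloads, mirroring iaf() in the pasted HTML. */
final class AnnotationInfoExpander {

   // A LinkedHashMap keeps insertion order, which matches the order of the replace() chain in iaf().
   static private final Map<String, String> CODES = new LinkedHashMap<>();
   static {
      CODES.put( "AFF_", "<br><h3>Affirmed</h3>" );
      CODES.put( "NEG_", "<br><h3>Negated</h3>" );
      CODES.put( "UNC_", "<br><h3>Uncertain</h3>" );
      CODES.put( "UNN_", "<br><h3>Uncertain, Negated</h3>" );
      CODES.put( "ANT", "<b>Anatomical Site</b>" );
      CODES.put( "DIS", "<b>Disease/ Disorder</b>" );
      CODES.put( "FND", "<b>Sign/ Symptom</b>" );
      CODES.put( "PRC", "<b>Procedure</b>" );
      CODES.put( "DRG", "<b>Medication</b>" );
      CODES.put( "EVT", "<b>Event</b>" );
      CODES.put( "TMX", "<b>Time</b>" );
      CODES.put( "UNK", "<b>Unknown</b>" );
      CODES.put( "SPC_", "&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;" );
      CODES.put( "[", "<i>" );
      CODES.put( "]", "</i>" );
      CODES.put( "NL_", "<br>" );
   }

   /** Applies every replacement in order; String.replace is literal and replaces all occurrences. */
   static String expand( final String payload ) {
      String html = payload;
      for ( Map.Entry<String, String> code : CODES.entrySet() ) {
         html = html.replace( code.getKey(), code.getValue() );
      }
      return html;
   }

   public static void main( final String... args ) {
      System.out.println( expand( "AFF_NL_DRGNL_ThalomidNL_SPC_C0723668NL_SPC_[before] doc timeNL_NL_" ) );
   }
}
```

The order of the entries matters in the same way it does in the JavaScript: "NL_" is replaced last so that the line-break marker is still intact while the other codes are being expanded.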
> 
> 
> -----Original Message-----
> From: Gandhi Rajan Natarajan [mailto:Gandhi.Natarajan@arisglobal.com]
> Sent: Wednesday, September 27, 2017 4:40 AM
> To: dev@ctakes.apache.org
> Subject: RE: Enabling drugner pipeline and identifying dates
> [EXTERNAL] [SUSPICIOUS]
> 
> Hi Sean,
> 
> Sorry for the delayed response; I was out of office due to illness.
> If I don't add BackwardsTimeAnnotator, I don't see any error related
> to the isTraining param, but I still couldn't get the superscript
> co-reference working. Please note that I am using the latest 4.0.1 jars.
> The piper file and console log messages are as follows:
> 
> PIPER FILE:
> // Advanced Tokenization: Regex sectionization, BIO Sentence Detector (lumper), Paragraphs, Lists
> load AdvancedTokenizerPipeline.piper
> add ContextDependentTokenizerAnnotator
> add POSTagger
> // Chunkers
> load ChunkerSubPipe.piper
> // Default fast dictionary lookup
> load DictionarySubPipe.piper
> add org.apache.ctakes.drugner.ae.DrugMentionAnnotator
> // Cleartk Entity Attributes
> load AttributeCleartkSubPipe.piper
> // Relations
> load RelationSubPipe.piper
> // Temporal
> load TemporalSubPipe.piper
> // Coreferences
> load CorefSubPipe.piper
> //add org.apache.ctakes.temporal.ae.BackwardsTimeAnnotator
> // Html output
> add pretty.html.HtmlTextWriter
> // XMl writer
> add FileTreeXmiWriter
> 
> CONSOLE LOG:
> 
> 22 Sep 2017 13:59:44  INFO ClearNLPSemanticRoleLabelerAE - Finished
> initializing
> 22 Sep 2017 13:59:44  INFO CleartkAnalysisEngine - Starting
> initializing for Assigning Attributes
> 22 Sep 2017 13:59:46  INFO CleartkAnalysisEngine - Finished
> initializing
> 22 Sep 2017 13:59:46  INFO ModifierExtractorAnnotator - Starting
> initializing
> 22 Sep 2017 13:59:46  INFO ModifierExtractorAnnotator - Finished
> initializing
> 22 Sep 2017 13:59:46  INFO DegreeOfRelationExtractorAnnotator -
> Starting initializing
> 22 Sep 2017 13:59:46  INFO DegreeOfRelationExtractorAnnotator -
> Finished initializing
> 22 Sep 2017 13:59:46  INFO LocationOfRelationExtractorAnnotator -
> Starting initializing
> 22 Sep 2017 13:59:46  INFO LocationOfRelationExtractorAnnotator -
> Finished initializing
> 22 Sep 2017 13:59:46  INFO BackwardsTimeAnnotator - Starting
> initializing
> 22 Sep 2017 13:59:46  INFO BackwardsTimeAnnotator - Finished
> initializing
> 22 Sep 2017 13:59:46  INFO DocTimeRelAnnotator - Starting
> initializing
> 22 Sep 2017 13:59:48  INFO DocTimeRelAnnotator - Finished
> initializing
> 22 Sep 2017 13:59:48  INFO EventTimeRelationAnnotator - Starting
> initializing
> 22 Sep 2017 13:59:49  INFO EventTimeRelationAnnotator - Finished
> initializing
> 22 Sep 2017 13:59:49  INFO EventEventRelationAnnotator - Starting
> initializing
> 22 Sep 2017 13:59:51  INFO EventEventRelationAnnotator - Finished
> initializing
> 22 Sep 2017 13:59:51  INFO ConstituencyParser - Initializing
> parser...
> 22 Sep 2017 13:59:54  INFO RegexSectionizer - Annotating Sections ...
> 22 Sep 2017 13:59:55  INFO RegexSectionizer - Finished processing
> 22 Sep 2017 13:59:55  INFO SentenceDetectorAnnotatorBIO - Starting
> processing ...
> 22 Sep 2017 13:59:55  INFO SentenceDetectorAnnotatorBIO - Finished
> processing
> 22 Sep 2017 13:59:55  INFO ParagraphAnnotator - Annotating Paragraphs
> ...
> 22 Sep 2017 13:59:55  INFO ParagraphAnnotator - Finished processing
> 22 Sep 2017 13:59:55  INFO ParagraphSentenceFixer - Adjusting
> Sentences overlapping Paragraphs ...
> 22 Sep 2017 13:59:55  INFO ParagraphSentenceFixer - Finished
> Processing
> 22 Sep 2017 13:59:55  INFO ListAnnotator - Annotating Lists ...
> 22 Sep 2017 13:59:55  INFO ListAnnotator - Finished processing
> 22 Sep 2017 13:59:55  INFO ListSentenceFixer - Adjusting Sentences
> overlapping Lists ...
> 22 Sep 2017 13:59:55  INFO ListSentenceFixer - Finished Processing
> 22 Sep 2017 13:59:55  INFO TokenizerAnnotatorPTB - process(JCas) in
> org.apache.ctakes.core.ae.TokenizerAnnotatorPTB
> 22 Sep 2017 13:59:55  INFO ContextDependentTokenizerAnnotator -
> process(JCas)
> 22 Sep 2017 13:59:55  INFO POSTagger - process(JCas)
> 22 Sep 2017 13:59:55  INFO Chunker -  process(JCas)
> 22 Sep 2017 13:59:55  INFO ChunkAdjuster -  process(JCas)
> 22 Sep 2017 13:59:55  INFO ChunkAdjuster -  process(JCas)
> 22 Sep 2017 13:59:55  INFO AbstractJCasTermAnnotator - Finding Named
> Entities ...
> 22 Sep 2017 13:59:55  INFO AbstractJCasTermAnnotator - Finished
> processing
> 22 Sep 2017 13:59:55  INFO DrugMentionAnnotator - process dev (JCas)
> 22 Sep 2017 13:59:55  INFO DrugMentionAnnotator - -1
> 22 Sep 2017 13:59:55  INFO DrugMentionAnnotator - -1
> 22 Sep 2017 13:59:55  INFO DrugMentionAnnotator - -1
> 22 Sep 2017 13:59:55  INFO DrugMentionAnnotator - -1
> 22 Sep 2017 13:59:55  INFO DrugMentionAnnotator - -1
> 22 Sep 2017 13:59:55  INFO DrugMentionAnnotator - -1
> 22 Sep 2017 13:59:55  INFO DrugMentionAnnotator - -1
> 22 Sep 2017 13:59:55  INFO DrugMentionAnnotator - -1
> 22 Sep 2017 13:59:55  INFO DrugMentionAnnotator - -1
> 22 Sep 2017 13:59:55  INFO DrugMentionAnnotator - -1
> 22 Sep 2017 13:59:55  INFO DrugMentionAnnotator - -1
> 22 Sep 2017 13:59:55  INFO DrugMentionAnnotator - -1
> 22 Sep 2017 13:59:55  INFO DrugMentionAnnotator - -1
> 22 Sep 2017 13:59:55  INFO DrugMentionAnnotator - -1
> 22 Sep 2017 13:59:55  INFO DrugMentionAnnotator - -1
> 22 Sep 2017 13:59:55  INFO DrugMentionAnnotator - -1
> 22 Sep 2017 13:59:55  INFO DrugMentionAnnotator - -1
> 22 Sep 2017 13:59:56  INFO DrugMentionAnnotator - -1
> 22 Sep 2017 13:59:56  INFO DrugMentionAnnotator - -1
> 22 Sep 2017 13:59:56  INFO DrugMentionAnnotator - -1
> 22 Sep 2017 13:59:56  INFO ClearNLPDependencyParserAE - Dependency
> parser starting with thread:pool-2-thread-1
> 22 Sep 2017 13:59:56  INFO ClearNLPDependencyParserAE - Dependency
> parser ending with thread:pool-2-thread-1
> 22 Sep 2017 13:59:56  INFO ClearNLPSemanticRoleLabelerAE - Starting
> processing ...
> 22 Sep 2017 13:59:56  INFO ClearNLPSemanticRoleLabelerAE - Finished
> processing
> 22 Sep 2017 13:59:56  INFO CleartkAnalysisEngine - Assigning
> Attributes ...
> 22 Sep 2017 13:59:56  INFO CleartkAnalysisEngine - Finished Assigning
> Attributes
> 22 Sep 2017 13:59:56  INFO ModifierExtractorAnnotator - Starting
> processing ...
> 22 Sep 2017 13:59:56  INFO ModifierExtractorAnnotator - Finished
> processing
> 22 Sep 2017 13:59:56  INFO DegreeOfRelationExtractorAnnotator -
> Starting processing ...
> 22 Sep 2017 13:59:56  INFO DegreeOfRelationExtractorAnnotator -
> Finished processing
> 22 Sep 2017 13:59:56  INFO LocationOfRelationExtractorAnnotator -
> Starting processing ...
> 22 Sep 2017 13:59:57  INFO LocationOfRelationExtractorAnnotator -
> Finished processing
> 22 Sep 2017 13:59:57  INFO BackwardsTimeAnnotator - Starting
> processing ...
> 22 Sep 2017 13:59:57  INFO BackwardsTimeAnnotator - Finished
> processing
> 22 Sep 2017 13:59:57  INFO DocTimeRelAnnotator - Starting processing
> ...
> 22 Sep 2017 13:59:58  INFO DocTimeRelAnnotator - Finished processing
> 22 Sep 2017 13:59:58  INFO EventTimeRelationAnnotator - Starting
> processing ...
> 22 Sep 2017 13:59:59  INFO EventTimeRelationAnnotator - Finished
> processing
> 22 Sep 2017 13:59:59  INFO EventEventRelationAnnotator - Starting
> processing ...
> 22 Sep 2017 13:59:59  INFO EventEventRelationAnnotator - Finished
> processing
> 22 Sep 2017 13:59:59  INFO MaxentParserWrapper - Started processing:
> test
> 22 Sep 2017 14:00:02  INFO MaxentParserWrapper - Done parsing: test
> 22 Sep 2017 14:00:03  INFO MentionClusterCoreferenceAnnotator -
> Finding Coreferences ...
> 22 Sep 2017 14:00:03  INFO MentionClusterCoreferenceAnnotator -
> Finished.
> 22 Sep 2017 14:00:03  INFO HtmlTextWriter - Writing HTML to D:\Gandhi\ArisG\cTAKES\apache-ctakes-4.0.0\bin_old\test_output\test.txt.pretty.html ...
> 22 Sep 2017 14:00:03  INFO HtmlTextWriter - Finished Writing
> 22 Sep 2017 14:00:03  INFO FileTreeXmiWriter - Writing XMI to D:\Gandhi\ArisG\cTAKES\apache-ctakes-4.0.0\bin_old\test_output\test.txt.xmi ...
> Sep 22, 2017 2:00:03 PM org.apache.uima.util.MessageReport
> decreasingWithTrace(51)
> WARNING: Message count: 1; Feature
> org.apache.ctakes.typesystem.type.textsem.Predicate:relations is
> marked multipleReferencesAllowed=false, but it has multiple
> references.  These will be serialized in duplicate. Message count
> indicates messages skipped to avoid potential flooding. Set FINE
> logging level for stacktrace.
> Sep 22, 2017 2:00:03 PM org.apache.uima.util.MessageReport
> decreasingWithTrace(51)
> WARNING: Message count: 2; Feature
> org.apache.ctakes.typesystem.type.textsem.Predicate:relations is
> marked multipleReferencesAllowed=false, but it has multiple
> references.  These will be serialized in duplicate. Message count
> indicates messages skipped to avoid potential flooding. Set FINE
> logging level for stacktrace.
> Sep 22, 2017 2:00:03 PM org.apache.uima.util.MessageReport
> decreasingWithTrace(51)
> WARNING: Message count: 4; Feature
> org.apache.ctakes.typesystem.type.textsem.Predicate:relations is
> marked multipleReferencesAllowed=false, but it has multiple
> references.  These will be serialized in duplicate. Message count
> indicates messages skipped to avoid potential flooding. Set FINE
> logging level for stacktrace.
> Sep 22, 2017 2:00:03 PM org.apache.uima.util.MessageReport
> decreasingWithTrace(51)
> WARNING: Message count: 8; Feature
> org.apache.ctakes.typesystem.type.textsem.Predicate:relations is
> marked multipleReferencesAllowed=false, but it has multiple
> references.  These will be serialized in duplicate. Message count
> indicates messages skipped to avoid potential flooding. Set FINE
> logging level for stacktrace.
> Sep 22, 2017 2:00:03 PM org.apache.uima.util.MessageReport
> decreasingWithTrace(51)
> WARNING: Message count: 16; Feature
> org.apache.ctakes.typesystem.type.textsem.Predicate:relations is
> marked multipleReferencesAllowed=false, but it has multiple
> references.  These will be serialized in duplicate. Message count
> indicates messages skipped to avoid potential flooding. Set FINE
> logging level for stacktrace.
> Sep 22, 2017 2:00:03 PM org.apache.uima.util.MessageReport
> decreasingWithTrace(51)
> WARNING: Message count: 32; Feature
> org.apache.ctakes.typesystem.type.textsem.Predicate:relations is
> marked multipleReferencesAllowed=false, but it has multiple
> references.  These will be serialized in duplicate. Message count
> indicates messages skipped to avoid potential flooding. Set FINE
> logging level for stacktrace.
> 22 Sep 2017 14:00:03  INFO FileTreeXmiWriter - Finished Writing
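The WARNING lines in the log above carry message counts of 1, 2, 4, 8, 16 and 32, and the text says that messages are skipped to avoid flooding. That pattern is consistent with a reporter that prints only when the running count reaches a power of two. The following is a minimal sketch of that throttling idea. It is an illustration only, not UIMA's MessageReport code, and the class and method names are hypothetical.

```java
/** Sketch: report a repeated warning only when its running count reaches a power of two. */
final class ThrottledWarning {

   private long count = 0;

   /** Counts one occurrence and returns true if this occurrence should be printed. */
   boolean shouldReport() {
      count++;
      // A positive number is a power of two exactly when it has a single set bit.
      return ( count & ( count - 1 ) ) == 0;
   }

   long getCount() {
      return count;
   }

   public static void main( final String... args ) {
      final ThrottledWarning warning = new ThrottledWarning();
      // 40 occurrences produce reports at counts 1, 2, 4, 8, 16 and 32.
      for ( int i = 0; i < 40; i++ ) {
         if ( warning.shouldReport() ) {
            System.out.println( "WARNING: Message count: " + warning.getCount() );
         }
      }
   }
}
```

Read this way, the six warnings in the log would stand for at least 32 occurrences of the same serialization message rather than six.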
> 
> 
> Sean,  we have also made some code changes in MeasurementFSM.java to
> identify certain measurements like '20 mg/m2' which were not
> identified out of the box.  Should we send the code changes to you so
> that you can consider the same to be productized? Please advise.
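To make the kind of measurement under discussion concrete, here is a small standalone sketch that picks out dose expressions such as '200mg' and '20 mg/m2' with a regular expression. It is not the MeasurementFSM change mentioned above (that patch is not shown in this thread), and it does not use any cTAKES API. The class name, the method name and the short unit list are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: find dose-style measurements such as "20 mg/m2" or "200mg" with a regular expression. */
final class MeasurementSketch {

   // Number, optional space, unit, then an optional "/unit" part such as "/m2".
   // The unit lists are a small illustrative assumption, not the units known to cTAKES.
   static private final Pattern MEASUREMENT = Pattern.compile(
         "\\b(\\d+(?:\\.\\d+)?)\\s?(mg|mcg|g|kg|ml)(?:\\s?/\\s?(m2|kg|ml|day))?\\b",
         Pattern.CASE_INSENSITIVE );

   /** Returns every measurement-like span in the text, in order of appearance. */
   static List<String> findMeasurements( final String text ) {
      final List<String> found = new ArrayList<>();
      final Matcher matcher = MEASUREMENT.matcher( text );
      while ( matcher.find() ) {
         found.add( matcher.group() );
      }
      return found;
   }

   public static void main( final String... args ) {
      final String sentence = "The patient started study treatment of Thalomid 200mg (days 1-21), "
            + "and Epirubicin, 20 mg/m2 (days 1, 8, and 15) on 06/07/02";
      // Prints [200mg, 20 mg/m2]; the day numbers and the date are not matched.
      System.out.println( findMeasurements( sentence ) );
   }
}
```

A regular expression over raw text is only a quick way to check which spans should count as measurements; a token-based state machine inside the pipeline would work on tokens rather than characters.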
> 
> Regards,
> Gandhi
> 
> 
> -----Original Message-----
> From: Finan, Sean [mailto:Sean.Finan@childrens.harvard.edu]
> Sent: Friday, September 22, 2017 6:54 PM
> To: dev@ctakes.apache.org
> Subject: RE: Enabling drugner pipeline and identifying dates
> [EXTERNAL] [SUSPICIOUS]
> 
> Hi Gandhi,
> 
> You don't need to add BackwardsTimeAnnotator to your piper.  It is
> added by the TemporalSubPipe.piper.  The  error that you are seeing
> regarding training is very strange, but you can try adding this line
> to the top of the file:
> set isTraining=false
> 
> Can you run a sample file with your piper and send me the log
> statements?  It might help me figure out what is going on.
> 
> > 
> > is there any doc or guide on how to start writing our own
> > annotator.
> There are two example annotators in the ctakes-examples project under
> the ae/ directory.  You can look at those, but I recommend that you
> look at some information on Uimafit, which can be used to create new
> annotators:
> https://uima.apache.org/d/uimafit-2.1.0/tools.uimafit.book.pdf
> An introduction to creating Analysis Engines (Annotators) is on page
> 5.
> 
> Coding style is individualistic, but below is a rubberstamp that I
> use to get started:
> 
> import org.apache.ctakes.core.pipeline.PipeBitInfo;
> import org.apache.log4j.Logger;
> import org.apache.uima.UimaContext;
> import org.apache.uima.analysis_engine.AnalysisEngineProcessException;
> import org.apache.uima.fit.component.JCasAnnotator_ImplBase;
> import org.apache.uima.jcas.JCas;
> import org.apache.uima.resource.ResourceInitializationException;
> 
> /**
>  * @author SPF , chip-nlp
>  * @version %I%
>  * @since 9/22/2017
>  */
> @PipeBitInfo(
>       name = "Template",
>       description = "For Example.",
>       role = PipeBitInfo.Role.ANNOTATOR
> )
> final public class Template extends JCasAnnotator_ImplBase {
> 
>    static private final Logger LOGGER = Logger.getLogger( "Template" );
> 
>    /**
>     * {@inheritDoc}
>     */
>    @Override
>    public void initialize( final UimaContext context ) throws ResourceInitializationException {
>       // Always call the super first
>       super.initialize( context );
>       // place AE initialization code here
>    }
> 
>    /**
>     * {@inheritDoc}
>     */
>    @Override
>    public void process( final JCas jCas ) throws AnalysisEngineProcessException {
>       LOGGER.info( "Processing ..." );
>       // Place AE processing code here
>       LOGGER.info( "Finished." );
>    }
> }
> 
> 
> 
> If you use IntelliJ as your ide you can create a file template with
> these parameters:
> 
> #if (${PACKAGE_NAME} && ${PACKAGE_NAME} != "")package ${PACKAGE_NAME};#end
> 
> import org.apache.ctakes.core.pipeline.PipeBitInfo;
> import org.apache.log4j.Logger;
> import org.apache.uima.UimaContext;
> import org.apache.uima.analysis_engine.AnalysisEngineProcessException;
> import org.apache.uima.fit.component.JCasAnnotator_ImplBase;
> import org.apache.uima.jcas.JCas;
> import org.apache.uima.resource.ResourceInitializationException;
> 
> #parse("File Header.java")
> @PipeBitInfo(
>       name = "${NAME}",
>       #if ( ${PROJECT_NAME} != "")description = "For ${PROJECT_NAME}.",#end
>       role = PipeBitInfo.Role.ANNOTATOR
> )
> final public class ${NAME} extends JCasAnnotator_ImplBase {
> 
>    static private final Logger LOGGER = Logger.getLogger( "${NAME}" );
> 
>    /**
>     * {@inheritDoc}
>     */
>    @Override
>    public void initialize( final UimaContext context ) throws ResourceInitializationException {
>       // Always call the super first
>       super.initialize( context );
>       // place AE initialization code here
>    }
> 
>    /**
>     * {@inheritDoc}
>     */
>    @Override
>    public void process( final JCas jCas ) throws AnalysisEngineProcessException {
>       LOGGER.info( "Processing ..." );
>       // Place AE processing code here
>       LOGGER.info( "Finished." );
>    }
> }
> 
> 
> 
> 
> 
> -----Original Message-----
> From: Gandhi Rajan Natarajan [mailto:Gandhi.Natarajan@arisglobal.com]
> Sent: Friday, September 22, 2017 2:23 AM
> To: dev@ctakes.apache.org
> Subject: RE: Enabling drugner pipeline and identifying dates
> [EXTERNAL] [SUSPICIOUS]
> 
> Hi Sean,
> 
> Thanks again for the detailed response.
> 
> I still couldn't manage to get superscript-1 co-reference in piper
> GUI.  Also I'm not able to use "BackwardsTimeAnnotator" in piper GUI
> as it gives me the below error:
> 
> org.apache.uima.resource.ResourceInitializationException: Initialization of annotator class "org.apache.ctakes.temporal.ae.BackwardsTimeAnnotator" failed.  (Descriptor: <unknown>)
>         at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initializeAnalysisComponent(PrimitiveAnalysisEngine_impl.java:271)
>         at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initialize(PrimitiveAnalysisEngine_impl.java:170)
> Caused by: java.lang.IllegalArgumentException: Please specify PARAM_IS_TRAINING - unable to infer it from context
>         at org.cleartk.ml.CleartkAnnotator.initialize(CleartkAnnotator.java:109)
> 
> Somewhere in old mails it's mentioned that it's because of missing
> dependencies so I tried adding ClearTkAnnotator with no luck yet. My
> piper file is as follows:
> 
> load AdvancedTokenizerPipeline.piper
> add ContextDependentTokenizerAnnotator
> add POSTagger
> load ChunkerSubPipe.piper
> load DictionarySubPipe.piper
> add org.apache.ctakes.drugner.ae.DrugMentionAnnotator
> load AttributeCleartkSubPipe.piper
> load RelationSubPipe.piper
> load TemporalSubPipe.piper
> load CorefSubPipe.piper
> add org.apache.ctakes.temporal.ae.BackwardsTimeAnnotator
> add pretty.html.HtmlTextWriter
> add FileTreeXmiWriter
> 
> Any suggestions on this? I'm using the latest 4.0.1 cTAKES jars.
> Regarding the identification of names, I will dig deeper into what
> you have mentioned.
> 
> Sorry to ask this, as you already mentioned that there are no detailed
> docs for cTAKES, but is there any doc or guide on how to start
> writing our own annotator if required? If not, is there a simple
> annotator that you would suggest we look into to get a better
> understanding of annotators before we proceed further?  Thanks in
> advance.
> 
> Regards,
> Gandhi
> 
> 
> -----Original Message-----
> From: Finan, Sean [mailto:Sean.Finan@childrens.harvard.edu]
> Sent: Thursday, September 21, 2017 7:59 AM
> To: dev@ctakes.apache.org
> Subject: RE: Enabling drugner pipeline and identifying dates
> [EXTERNAL] [SUSPICIOUS]
> 
> Hi Gandhi,
> 
> > 
> > We guess we are missing out on something as we could not find co-
> > references for "200mg". Should we add anymore piper for this?
> The piper commands that I sent has everything to obtain
> coreferences.  I use it regularly - it is what I used on your example
> sentence to get the coreferences that I mentioned.
> 
> > 
> > Also the change mentioned in the thread ...
> That is a very old thread and I don't think that it applies to what
> you are trying to do.
> 
> > 
> > We also have a requirement to identify the patient names and sex
> As James said, ctakes isn't really meant to do this.  Ctakes is
> geared toward extracting clinical data, and to this point names have
> not fallen into that category.  It is more a task for general
> nlp.  There is an opennlp model that can identify names and a few
> others (I used to see names using GATE).  ctakes has wrapped opennlp
> for other tasks and you should be able to do the same to adapt an
> engine for names into ctakes.
> 
> > 
> > cTAKES is unable to identify the dates like 20Aug02 or 20/Aug/02 or
> > 06 
> > / 07 / 02 or 27Aug2002
> As Chen mentioned, the BackwardsTimeAnnotator module uses an ML model
> trained on gold data.  It isn't perfect.  You can add another time
> annotator on top of this to get some of the more simply formatted
> date mentions - there are a lot of them out there.  Personally I have
> used jchronic as it can be easily tweaked to recognize medically-
> relevant temporal expressions relating to surgery, pharmacology, etc.
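
A minimal sketch of such an extra date pass, assuming plain regular expressions rather than jchronic or the cTAKES temporal model. The class name SimpleDateFinder and its method are hypothetical and not part of cTAKES; it only covers the formats raised in this thread (20Aug02, 20/Aug/02, 06 / 07 / 02, 27Aug2002). Inside a real annotator the matched spans would still have to be turned into time annotations in the JCas.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; it is not a cTAKES or jchronic class.
final class SimpleDateFinder {

   private static final String MONTH
         = "(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)";
   // Four-digit or two-digit year.
   private static final String YEAR = "(?:\\d{4}|\\d{2})";
   // 20Aug02, 27Aug2002, 20/Aug/02
   private static final String DAY_MONTH_YEAR
         = "\\d{1,2}\\s*/?\\s*" + MONTH + "\\s*/?\\s*" + YEAR;
   // 06/07/02, 06 / 07 / 02
   private static final String NUMERIC
         = "\\d{1,2}\\s*/\\s*\\d{1,2}\\s*/\\s*" + YEAR;
   // The lookarounds keep a match from starting or ending inside a longer token.
   private static final Pattern DATE = Pattern.compile(
         "(?<![\\w/])(?:" + DAY_MONTH_YEAR + "|" + NUMERIC + ")(?![\\w/])",
         Pattern.CASE_INSENSITIVE );

   private SimpleDateFinder() {
   }

   // Returns the text of each date-like mention, in document order.
   static List<String> findDates( final String text ) {
      final List<String> dates = new ArrayList<>();
      final Matcher matcher = DATE.matcher( text );
      while ( matcher.find() ) {
         dates.add( matcher.group() );
      }
      return dates;
   }

   public static void main( final String... args ) {
      final String note = "On 20Aug02, the investigator noted that this patient"
                          + " was suffering worsening fatigue.";
      findDates( note ).forEach( System.out::println );
   }
}
```

Two-part mentions such as "03/02" are deliberately left alone here, since a month/year pair cannot be told apart from a fraction or ratio without more context.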
> 
> Sean
> 
> 
> -----Original Message-----
> From: Finan, Sean [mailto:Sean.Finan@childrens.harvard.edu]
> Sent: Wednesday, September 20, 2017 8:50 AM
> To: dev@ctakes.apache.org
> Subject: RE: Enabling drugner pipeline and identifying dates
> [EXTERNAL] [SUSPICIOUS]
> 
> Hi Gandhi,
> 
> I don't have time to go through all of this right now, but I will try
> to get to it soon.
> 
> Make sure that you are running the latest version in trunk.
> 
> Sean
> 
> -----Original Message-----
> From: Gandhi Rajan Natarajan [mailto:Gandhi.Natarajan@arisglobal.com]
> Sent: Wednesday, September 20, 2017 7:03 AM
> To: dev@ctakes.apache.org
> Subject: RE: Enabling drugner pipeline and identifying dates
> [EXTERNAL]
> 
> Hi, Could someone help me out on the below queries please?
> 
> Regards,
> Gandhi
> 
> -----Original Message-----
> From: Gandhi Rajan Natarajan [mailto:Gandhi.Natarajan@arisglobal.com]
> Sent: Tuesday, September 19, 2017 8:51 PM
> To: dev@ctakes.apache.org
> Subject: RE: Enabling drugner pipeline and identifying dates
> [EXTERNAL]
> 
> Hi Sean,
> 
> Thanks again for the detailed and prompt response. We were able to
> run the piper GUI as per your advice. But in the output for the
> sentence "The patient started study treatment of Thalomid 200mg
> ( days 1 - 21 ) , and Epirubicin ,20 mg / m2 ( days 1 , 8 , and 15 )
> on 06 / 07 / 02 for the treatment of hepatocellular carcinoma.", we
> were not able to find superscript-1 as you mentioned earlier, but
> could find superscript-2, 3, etc.  We guess we are missing something,
> as we could not find co-references for "200mg". Should we add any
> more piper commands for this?
> 
> Also the change mentioned in the thread -
> http://mail-archives.apache.org/mod_mbox/ctakes-user/201403.mbox/%3CCAL6WimrJ_mm1+XyggBZv62diYuWP0ScA9VEV8mNHGWe4hSNHQg@mail.gmail.com%3E
> is required for the drug-ner module to identify drug-ner annotations.
> 
> 1) We also have a requirement to identify the patient names and sex
> available in narrative texts. Please let us know how to achieve this,
> as cTAKES is not identifying proper nouns or their relationship
> with the patient.
> Eg. "This male patient named Tom Hardy aged 35 years is participating
> in a Non-IND study"
> 
> 2) cTAKES is unable to identify the dates like 20Aug02 or 20/Aug/02
> or 06 / 07 / 02 or 27Aug2002 as in the below example. Please let us
> know how to enhance the system to identify such date patterns.
> E.g " On 20Aug02, the investigator noted that this patient was
> suffering worsening fatigue and got tired getting out of his chair"
> 
> Regards,
> Gandhi
> 
> 
> -----Original Message-----
> From: Finan, Sean [mailto:Sean.Finan@childrens.harvard.edu]
> Sent: Monday, September 18, 2017 10:02 PM
> To: dev@ctakes.apache.org
> Subject: RE: Enabling drugner pipeline and identifying dates
> [EXTERNAL]
> 
> Hi Gandhi,
> 
> > 
> > So in this case will be able to see drug attributes in the output
> > XML?
> As long as you have the DrugMentionAnnotator in your pipeline you
> should be able to find drug attributes in the xml output file.
> 
> > 
> > we also saw some code changes needs to be done to use drug-ner
> > module. Is it still valid?
> As far as I know there aren't any necessary code changes to get drug
> ner running.  However, I do not normally use drugner so I can't say
> for certain.
> 
> > 
> > Also you mentioned that the drug-ner module is out of date
> It can still be used and will produce annotations.  All that I meant
> was that there may not be many people out there using it.  It is not
> part of the default pipeline.
> 
> > You also mentioned that when you run the sentence, the date was
> > identified. Where and how exactly did you run it so that we can check
> > the same?
> I run the following in a piper file because I am interested in a lot
> of modules (I added drugner just for you):
> 
> // Advanced Tokenization: Regex sectionization, BIO Sentence Detector (lumper), Paragraphs, Lists
> load AdvancedTokenizerPipeline.piper
> add ContextDependentTokenizerAnnotator
> add POSTagger
> // Chunkers
> load ChunkerSubPipe.piper
> // Default fast dictionary lookup
> load DictionarySubPipe.piper
> add org.apache.ctakes.drugner.ae.DrugMentionAnnotator
> // Cleartk Entity Attributes
> load AttributeCleartkSubPipe.piper
> // Relations
> load RelationSubPipe.piper
> // Temporal
> load TemporalSubPipe.piper
> // Coreferences
> load CorefSubPipe.piper
> // Html output
> add pretty.html.HtmlTextWriter
> 
> For information on piper files, see
> https://cwiki.apache.org/confluence/display/CTAKES/Piper+Files
> I run it in my IDE with:
> org.apache.ctakes.core.pipeline.PiperFileRunner -Xmx3G -p <FileAsAbove>.piper -i org/apache/ctakes/examples/notes -o <OutputDir> --user <MyUmlsUser> --pass <MyUmlsPass>
> You can run it by command line by substituting
> "org.apache.ctakes.core.pipeline.PiperFileRunner -Xmx3G" with
> "bin/runPiperFile".
> You can also run it through a ctakes 4.0.1 (trunk) gui.  See
> https://cwiki.apache.org/confluence/display/CTAKES/Piper+File+Submitter+GUI
> 
> > 
> > I'm not able to see any clickable option in HTML output
> You must have the HtmlTextWriter at the end of your pipeline to
> produce html files.  To keep the xml file output, place "add
> FileTreeXmiWriter" at the end of the piper.
> 
> > 
> > Apologizes for too many
> No worries, we are happy to have your interest!
> 
> Sean
> 
> 
> -----Original Message-----
> From: Gandhi Rajan Natarajan [mailto:Gandhi.Natarajan@arisglobal.com]
> Sent: Saturday, September 16, 2017 7:01 AM
> To: dev@ctakes.apache.org
> Subject: RE: Enabling drugner pipeline and identifying dates
> [EXTERNAL]
> 
> Hi Sean,
> 
> Thanks again for the prompt response. Appreciate your input on adding
> DrugMentionAnnotator. Actually, we are relying on pretty printer
> output just to understand the analysis. Our logic to extract
> disorders and findings is based on the XML file generated by
> https://github.com/healthnlp/examples/blob/master/ctakes-temporal-demo/src/main/java/org/apache/ctakes/web/client/servlet/DemoServlet.java
> So in this case, will we be able to see drug attributes in the output
> XML?
> 
> In one of the old posts
> (http://mail-archives.apache.org/mod_mbox/ctakes-user/201403.mbox/%3CCAL6WimrJ_mm1+XyggBZv62diYuWP0ScA9VEV8mNHGWe4hSNHQg@mail.gmail.com%3E)
> we also saw that some code changes need to be done to use the drug-ner
> module. Is that still valid? Also you mentioned that the drug-ner
> module is out of date. Does that mean it cannot be used, or that it may
> not provide accurate analysis? Also, what changes need to be made to
> bring it up to date, so that we can try them if you can assist?
> 
> You also mentioned that when you run the sentence, the date was
> identified. Where and how exactly did you run it so that we can check
> the same? Also, regarding your explanation of coreference, I'm not able
> to see any clickable option in the HTML output, so I wanted to
> understand how we can run and check that too.
> 
> Apologies for too many questions, as we are just a week old in NLP
> and cTAKES. Thanks in advance.
> 
> Regards,
> Gandhi
> 
> This email and any files transmitted with it are confidential and
> intended solely for the use of the individual or entity to whom they
> are addressed. If you are not the named addressee you should not
> disseminate, distribute or copy this e-mail. Please notify the sender
> or system manager by email immediately if you have received this e-
> mail by mistake and delete this e-mail from your system. If you are
> not the intended recipient you are notified that disclosing, copying,
> distributing or taking any action in reliance on the contents of this
> information is strictly prohibited and against the law.
> 

RE: Enabling drugner pipeline and identifying dates [EXTERNAL] [SUSPICIOUS] [SUSPICIOUS]

Posted by Gandhi Rajan Natarajan <Ga...@arisglobal.com>.
Thanks Sean and Tim. Will ping back if I don’t hear from you guys in a week's time. Thanks for all the responses.

Regards,
Gandhi

-----Original Message-----
From: Miller, Timothy [mailto:Timothy.Miller@childrens.harvard.edu]
Sent: Saturday, September 30, 2017 12:45 AM
To: dev@ctakes.apache.org
Subject: Re: Enabling drugner pipeline and identifying dates [EXTERNAL] [SUSPICIOUS] [SUSPICIOUS]

It is a very busy time for me but this is on my todo list. Don't be afraid to ping in a week or so if you don't hear anything.

Tim

On Fri, 2017-09-29 at 14:04 +0000, Finan, Sean wrote:
> Hi Gandhi,
> >
> > Did you mean that with the text I sent, the co-reference
> > superscript-1 will be lost?
> Yes.  Well, to be more clear, the coreference that was resolved as #1
> in your original sentence alone will be lost.  However, there are
> eight or nine coreference chains discovered in your full paragraph,
> and one of those will have superscript 1s.
>
> >
> > Could someone have a look and know your thoughts please?
> Thank you for creating the jira and the patch.  I am sure that
> somebody will take a look.
>
> Thanks,
> Sean
>
>
> -----Original Message-----
> From: Gandhi Rajan Natarajan [mailto:Gandhi.Natarajan@arisglobal.com]
>
> Sent: Friday, September 29, 2017 2:25 AM
> To: dev@ctakes.apache.org
> Subject: RE: Enabling drugner pipeline and identifying dates
> [EXTERNAL] [SUSPICIOUS]
>
> Hi Sean,
>
> Thanks again for the response. I guess it's a mistake on my side that I
> didn't send the complete text. Did you mean that with the text I sent,
> the co-reference superscript-1 will be lost?
>
> Also, as per your advice, we have created an issue -
> https://issues.apache.org/jira/browse/CTAKES-459 - for the
> measurement FSM changes and attached the modified file. Could
> someone have a look and let us know your thoughts please?
>
> Regards,
> Gandhi
>
>
> -----Original Message-----
> From: Finan, Sean [mailto:Sean.Finan@childrens.harvard.edu]
> Sent: Thursday, September 28, 2017 8:21 PM
> To: dev@ctakes.apache.org
> Cc: Miller, Timothy <Ti...@childrens.harvard.edu>
> Subject: RE: Enabling drugner pipeline and identifying dates
> [EXTERNAL] [SUSPICIOUS]
>
> Hi Gandhi,
>
> I don't recall you sending me that entire snippet of text.  I think
> that I only had your single example sentence.
> You have discovered one of the quirks of software: "change the data,
> change the result."
> Ctakes is a system with many moving parts.  Things that precede or
> follow your original example sentence will change the evaluation of
> that sentence.
> With the pipeline you are using and the full note, you should see a
> number (mine is 4) next to the first "thalomid" in the original
> example sentence.  If you click that number you should see (to the
> right) 4 instances of "thalomid".
> Tim can correct me here, but maybe the coreference module ranked the
> links between "thalomid" as much higher than the rank between "study
> treatment of thalomid 200mg" and "the treatment of hepatocellular
> carcinoma" and discarded the encapsulating treatment texts from
> markables?  It is probably more complex than that.
>
> >
> > we have also made some code changes in MeasurementFSM.java to
> > identify certain measurements like '20 mg/m2' which was not
> > identified out of the box.  Should we send the code changes to you
> > so that you can consider the same to be productized ? Please
> > advise."
> I don't know if you've noticed the recent emails on the dev list
> involving Alexandru Zbarcea.  Alex has been creating or commenting on
> Jira items and attaching code for  fixes and enhancements.  This is a
> widely used process and is fairly easy to follow.   I think that the
> following links are relevant:
> Working with issues:  https://confluence.atlassian.com/jiracoreserver073/working-with-issues-861257307.html
> Creating patches:   https://confluence.atlassian.com/crucible/creating-patch-files-for-pre-commit-reviews-298977458.html
> Attaching files:   https://confluence.atlassian.com/jiracorecloud/attaching-files-and-screenshots-to-issues-765593805.html
>
> I don't know if you have a jira account and permissions for the ctakes
> project.  An administrator may need to set that up for you.
>
> Thanks,
> Sean
>
> -----Original Message-----
> From: Gandhi Rajan Natarajan [mailto:Gandhi.Natarajan@arisglobal.com]
> Sent: Thursday, September 28, 2017 4:09 AM
> To: dev@ctakes.apache.org
> Subject: RE: Enabling drugner pipeline and identifying dates
> [EXTERNAL] [SUSPICIOUS]
>
> Hi Sean,
>
> Thanks for the response. I was able to see the co-reference
> superscript using the html file that you sent. Interestingly even I
> was able to generate the sample HTML using  piper GUI by  having only
> that single line - " The patient started study treatment of Thalomid
> 200mg (days 1-21), and Epirubicin, 20 mg/m2 (days 1, 8, and 15) on
> 06/07/02 for the treatment of hepatocellular carcinoma. " in the input
> file.
>
> But when I change the input file content with the following lines:
>
> "This patient is participating in a Non-IND study; Protocol CG-
> 000424: "Phase I/II of Thalidomide and Epirubicin in Patients with
> Unresectable or Metastatic Hepatocellular Carcinoma".Information has
> been received from the investigator regarding an 82 year-old male
> patient who had gastrointestinal bleeding while on Thalomid,
> Epirubicin, and Coumadin. He had a past medical history of
> diverticulosis in 03/02 and a right atrial clot from intraventricular
> catheter (IVC) for which he was started on Coumadin. During the
> hospitalization for a right atrial clot in 03/02 hepatocellular
> carcinoma was first noted and he was referred to an oncologist.  The
> patient started study treatment of Thalomid 200mg (days 1-21), and
> Epirubicin, 20 mg/m2 (days 1, 8, and 15) on 06/07/02 for the treatment
> of hepatocellular carcinoma.  He was concomitantly receiving Cardura,
> Ambien (for insomnia), Megace, Coumadin, and Oxycodone. This patient
> presented to the emergency room with the chief complaint of
> hematochezia. He reported noticing bright red blood and small clots
> mixed in with his stool. On 07/13/02, he was admitted due to
> gastrointestinal bleed.  The physician ordered 2 large bore
> intravenous lines and planned to transfuse for hematocrit less than
> 30%. Due to the  INR (international normalized ratio) level of 3.0,
> Coumadin was held. He was also noted to have bilateral lower extremity
> edema with dyspnea on exertion.  On 07/13/02, he had a chest X-ray PA
> and lateral done that showed no evidence of acute pneumonia or
> congestive heart failure.  On 07/14/02, he underwent  an ultrasound
> which was negative for deep vein thrombosis. This patient did not take
> Thalomid on the day of his admittance to the hospital, but resumed
> treatment shortly after with no return of symptoms. On 07/15/02, he
> was discharged in stable condition. There have been no further reports
> of bleeding at this time. Thedoctor has assessed the hematochezia as
> related to Coumadin treatment and previously diagnosed diverticulosis,
> and not to protocol therapy with Thalomid and Epirubicin.Additional
> information received from the investigator on 27Aug02 reveals that
> this male patient began on 07Jun02 two cycles of therapy with
> Thalidomide and Epirubicin.  His post cycle two computed tomography
> scans revealed increase in size of liver lesion with development of
> multiple new satellite nodules.  On 29Jul02, the investigator removed
> this patient from protocol for progressive disease and recommended
> hospice care.  After seeking a second opinion from two other
> institutions, this patient was admitted to hospice on 05Aug02.  On
> 20Aug02, the investigator noted that this patient was suffering
> worsening fatigue and got tired getting out of his chair.  On 25Aug02,
> this patient died due to disease progression.  The investigator
> assessed the death as not related to study treatment and expected"
>
> The co-reference superscript is lost by then. Did you try the
> complete text above by any chance in your piper GUI? Also I guess you
> did not notice the question in my last post - " Sean, we have also
> made some code changes in MeasurementFSM.java to identify certain
> measurements like '20 mg/m2' which was not identified out of the box.
> Should we send the code changes to you so that you can consider the
> same to be productized ? Please advise."
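
To illustrate the kind of compound-unit measurement discussed here, a minimal sketch follows. It uses a plain regular expression, not the finite state machine in MeasurementFSM.java, and it is not the patch attached to the Jira issue (that code is not shown in this thread). The class name SimpleMeasurementFinder and the short unit list are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; it is not the cTAKES MeasurementFSM.
final class SimpleMeasurementFinder {

   // Assumed, deliberately short unit list; a real list would be much longer.
   private static final String UNIT = "(?:mg|mcg|g|kg|ml|l|m2)";
   // A number, a unit, and optionally "/ unit" for compound units such as mg/m2.
   private static final Pattern MEASUREMENT = Pattern.compile(
         "(?<![\\w.])\\d+(?:\\.\\d+)?\\s*" + UNIT
         + "(?:\\s*/\\s*" + UNIT + ")?(?![\\w^])",
         Pattern.CASE_INSENSITIVE );

   private SimpleMeasurementFinder() {
   }

   // Returns the text of each measurement-like mention, in document order.
   static List<String> findMeasurements( final String text ) {
      final List<String> measurements = new ArrayList<>();
      final Matcher matcher = MEASUREMENT.matcher( text );
      while ( matcher.find() ) {
         measurements.add( matcher.group() );
      }
      return measurements;
   }

   public static void main( final String... args ) {
      final String sentence = "The patient started study treatment of Thalomid 200mg"
                              + " (days 1-21), and Epirubicin, 20 mg/m2 (days 1, 8, and 15)"
                              + " on 06/07/02 for the treatment of hepatocellular carcinoma.";
      findMeasurements( sentence ).forEach( System.out::println );
   }
}
```

The trailing lookahead is what stops "2 large bore" from being read as "2 l"; a unit only counts when it is not followed by another word character.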
>
>
> Regards,
> Gandhi
>
>
> -----Original Message-----
> From: Finan, Sean [mailto:Sean.Finan@childrens.harvard.edu]
> Sent: Wednesday, September 27, 2017 5:53 PM
> To: dev@ctakes.apache.org
> Subject: RE: Enabling drugner pipeline and identifying dates
> [EXTERNAL] [SUSPICIOUS]
>
> Hi Gandhi,
>
> I am glad that you are feeling better.
> I don't understand why you aren't getting the same output as me.  I
> just ran your example sentence with your piper with a fresh checkout
> and get the html below.  The css follows.  Copy and paste into a file
> and see if you see the corefs.
>
> /////////////////////////////////////////////////////  html, copy into
> file  /////////////////////////////////////////////////
>
> <!DOCTYPE html>
> <html>
> <head>
>   <title>OneLiner Output</title>
> </head>
> <body>
> <link rel="stylesheet" href="ctakes.pretty.css" type="text/css"
> media="screen"> <h2>OneLiner</h2>  <i>Text processing finished on: 9
> 27 2017, 08:15:31</i> <hr>
>
> <div id="content">
>
> <p>
> The patient <span class="AFF_"
> onClick="iaf('AFF_NL_EVTNL_startedNL_SPC_[before] doc timeNL_NL_')"
> TIP="Event ">started</span> study <span class="AFF_"
> onClick="iaf('AFF_NL_EVTNL_treatmentNL_SPC_[before] doc
> timeNL_NL_PRCNL_treatmentNL_SPC_C0087111NL_SPC_[Therapeutic
> procedure]NL_SPC_[before] doc timeNL_NL_')" TIP="Event Procedure
> ">treatment</span><span class="PRC"><sup>&bull;</sup></span> of <span
> class="AFF_"
> onClick="iaf('AFF_NL_DRGNL_ThalomidNL_SPC_C0723668NL_SPC_[before] doc
> timeNL_NL_')" TIP="Drug ">Thalomid</span><span
> class="DRG"><sup>&bull;</sup></span> <span class="AFF_"
> onClick="iaf('AFF_NL_EVTNL_200mgNL_SPC_[before] doc timeNL_NL_')"
> TIP="Event ">200mg</span><span class="UNK"
> onClick="crf1()"><sup>1</sup></span> ( <span class="GNR_"
> onClick="iaf('GNR_NL_TMXNL_daysNL_NL_')" TIP="Time ">days</span> 1 -
> 21 ) , and <span class="AFF_"
> onClick="iaf('AFF_NL_DRGNL_EpirubicinNL_SPC_C0014582NL_SPC_[before]
> doc timeNL_NL_')" TIP="Drug ">Epirubicin</span><span
> class="DRG"><sup>&bull;</sup></span> , 20 mg / m2 ( <span class="GNR_"
> onClick="iaf('GNR_NL_TMXNL_days 1 , 8NL_NL_')" TIP="Time ">days 1 ,
> 8</span> , and 15 ) on <span class="GNR_"
> onClick="iaf('GNR_NL_TMXNL_06 / 07 / 02NL_SPC_[CONTAINS]
> treatmentNL_NL_')" TIP="Time ">06 / 07 / 02</span> for the <span
> class="AFF_" onClick="iaf('AFF_NL_EVTNL_treatmentNL_SPC_[before] doc
> timeNL_SPC_06 / 07 / 02
> [CONTAINS]NL_NL_PRCNL_treatmentNL_SPC_C0087111NL_SPC_[Therapeutic
> procedure]NL_SPC_[before] doc timeNL_NL_')" TIP="Event Procedure
> ">treatment</span><span class="PRC"><sup>&bull;</sup></span> of <span
> class="AFF_" onClick="iaf('AFF_NL_DISNL_hepatocellular
> carcinomaNL_SPC_C2239176NL_SPC_[Liver carcinoma]NL_SPC_[before] doc
> timeNL_NL_')" TIP="Disorder ">hepatocellular </span><span class="AFF_"
> onClick="iaf('AFF_NL_DISNL_hepatocellular
> carcinomaNL_SPC_C2239176NL_SPC_[Liver carcinoma]NL_SPC_[before] doc
> timeNL_NL_EVTNL_carcinomaNL_SPC_[before] doc timeNL_NL_')"
> TIP="Disorder Event ">carcinoma</span><span class="DIS"
> onClick="crf1()"><sup>1</sup></span> .
> <br>
>
> </p>
>
> </div>
>
> <div id="ia"> Annotation Information </div> <script
> type="text/javascript">
>   function iaf(txt) {
>     var aff=txt.replace( /AFF_/g,"<br><h3>Affirmed</h3>" );
>     var neg=aff.replace( /NEG_/g,"<br><h3>Negated</h3>" );
>     var unc=neg.replace( /UNC_/g,"<br><h3>Uncertain</h3>" );
>     var unn=unc.replace( /UNN_/g,"<br><h3>Uncertain, Negated</h3>" );
>     var ant=unn.replace( /ANT/g,"<b>Anatomical Site</b>" );
>     var dis=ant.replace( /DIS/g,"<b>Disease/ Disorder</b>" );
>     var fnd=dis.replace( /FND/g,"<b>Sign/ Symptom</b>" );
>     var prc=fnd.replace( /PRC/g,"<b>Procedure</b>" );
>     var drg=prc.replace( /DRG/g,"<b>Medication</b>" );
>     var evt=drg.replace( /EVT/g,"<b>Event</b>" );
>     var tmx=evt.replace( /TMX/g,"<b>Time</b>" );
>     var unk=tmx.replace( /UNK/g,"<b>Unknown</b>" );
>     var spc=unk.replace(
> /SPC_/g,"&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;" );
>     var prf1=spc.replace( /\[/g,"<i>" );
>     var prf2=prf1.replace( /\]/g,"</i>" );
>     var nl=prf2.replace( /NL_/g,"<br>" );
>     document.getElementById("ia").innerHTML = nl;
>   }
>   function crf1() {
>     document.getElementById("ia").innerHTML = "<br><h3>Coreference
> Chain</h3>study treatment of Thalomid 200mg<br>the treatment of
> hepatocellular carcinoma";
>   }
> </script></body>
> </html>
>
>
>
> /////////////////////////////////////////////////////  css, copy into
> file named ctakes.pretty.css in same directory as html
> /////////////////////////////////////////////////
>
>
>
> .GNR_ {
>   position: relative;
>   display: inline-block gray;
>   border-bottom: 0.10em solid gray;
> }
>
> .AFF_ {
>   position: relative;
>   display: inline-block green;
>   border-bottom: 0.15em solid green;
> }
>
> .UNC_ {
>   position: relative;
>   display: inline-block gold;
>   border-bottom: 0.16em dotted gold;
> }
>
> .NEG_ {
>   position: relative;
>   display: inline-block red;
>   border-bottom: 0.16em dashed red;
> }
>
> .UNN_ {
>   position: relative;
>   display: inline-block orange;
>   border-bottom: 0.16em dashed orange;
> }
>
> .FND {
>   color: magenta;
> }
>
> .DIS {
>   color: black;
> }
>
> .DRG {
>   color: red;
> }
>
> .PRC {
>   color: blue;
> }
>
> .ANT {
>   color: gray;
> }
>
> .UNK {
>   color: gray;
> }
>
> [TIP] {
>   position: relative;
>   z-index: 2;
>   cursor: pointer;
> }
> [TIP]::before,
> [TIP]::after {
>   visibility: hidden;
>   -ms-filter: "progid:DXImageTransform.Microsoft.Alpha(Opacity=0)";
>   filter: progid: DXImageTransform.Microsoft.Alpha(Opacity=0);
>   opacity: 0;
>   pointer-events: none;
> }
> [TIP]::before {
>   position: absolute;
>   bottom: 0%;
>   left: 100%;
>   margin-bottom: 5px;
>   padding: 7px;
>   -webkit-border-radius: 3px;
>   -moz-border-radius: 3px;
>   border-radius: 3px;
>   background-color: #000;
>   background-color: hsla(0, 0%, 20%, 0.9);
>   color: #fff;
>   content: attr(TIP);
>   text-align: center;
>   font-size: 14px;
>   line-height: 1.2;
> }
> [TIP]:hover::before,
> [TIP]:hover::after {
>   visibility: visible;
>   -ms-filter: "progid:DXImageTransform.Microsoft.Alpha(Opacity=100)";
>   filter: progid: DXImageTransform.Microsoft.Alpha(Opacity=100);
>   opacity: 1;
> }
>
> div#ia {
>   position: fixed;
>   top: 0;
>   right: 0;
>   width: 20%;
>   height: 100%;
>   padding: 10px;
>   overflow: auto;
>   background-color: lightgray;
> }
>
> div#content {
>   width: 79%;
>   height: 100%;
>   padding: 10px;
>   overflow: auto;
> }
>
>
> -----Original Message-----
> From: Gandhi Rajan Natarajan [mailto:Gandhi.Natarajan@arisglobal.com]
> Sent: Wednesday, September 27, 2017 4:40 AM
> To: dev@ctakes.apache.org
> Subject: RE: Enabling drugner pipeline and identifying dates
> [EXTERNAL] [SUSPICIOUS]
>
> Hi Sean,
>
> Sorry for the delayed response; I was out of office due to illness.
> If I don't add BackwardsTimeAnnotator, I don't see any error related
> to the isTraining param. But I still couldn't get the superscript
> coreference working. Please note that I am using the latest 4.0.1 jars.
> The piper file and console log messages are as follows:
>
> PIPER FILE:
> // Advanced Tokenization: Regex sectionization, BIO Sentence Detector (lumper), Paragraphs, Lists
> load AdvancedTokenizerPipeline.piper
> add ContextDependentTokenizerAnnotator
> add POSTagger
> // Chunkers
> load ChunkerSubPipe.piper
> // Default fast dictionary lookup
> load DictionarySubPipe.piper
> add org.apache.ctakes.drugner.ae.DrugMentionAnnotator
> // Cleartk Entity Attributes
> load AttributeCleartkSubPipe.piper
> // Relations
> load RelationSubPipe.piper
> // Temporal
> load TemporalSubPipe.piper
> // Coreferences
> load CorefSubPipe.piper
> //add org.apache.ctakes.temporal.ae.BackwardsTimeAnnotator
> // Html output
> add pretty.html.HtmlTextWriter
> // XMI writer
> add FileTreeXmiWriter
>
> CONSOLE LOG:
>
> 22 Sep 2017 13:59:44  INFO ClearNLPSemanticRoleLabelerAE - Finished
> initializing
> 22 Sep 2017 13:59:44  INFO CleartkAnalysisEngine - Starting
> initializing for Assigning Attributes
> 22 Sep 2017 13:59:46  INFO CleartkAnalysisEngine - Finished
> initializing
> 22 Sep 2017 13:59:46  INFO ModifierExtractorAnnotator - Starting
> initializing
> 22 Sep 2017 13:59:46  INFO ModifierExtractorAnnotator - Finished
> initializing
> 22 Sep 2017 13:59:46  INFO DegreeOfRelationExtractorAnnotator -
> Starting initializing
> 22 Sep 2017 13:59:46  INFO DegreeOfRelationExtractorAnnotator -
> Finished initializing
> 22 Sep 2017 13:59:46  INFO LocationOfRelationExtractorAnnotator -
> Starting initializing
> 22 Sep 2017 13:59:46  INFO LocationOfRelationExtractorAnnotator -
> Finished initializing
> 22 Sep 2017 13:59:46  INFO BackwardsTimeAnnotator - Starting
> initializing
> 22 Sep 2017 13:59:46  INFO BackwardsTimeAnnotator - Finished
> initializing
> 22 Sep 2017 13:59:46  INFO DocTimeRelAnnotator - Starting initializing
> 22 Sep 2017 13:59:48  INFO DocTimeRelAnnotator - Finished initializing
> 22 Sep 2017 13:59:48  INFO EventTimeRelationAnnotator - Starting
> initializing
> 22 Sep 2017 13:59:49  INFO EventTimeRelationAnnotator - Finished
> initializing
> 22 Sep 2017 13:59:49  INFO EventEventRelationAnnotator - Starting
> initializing
> 22 Sep 2017 13:59:51  INFO EventEventRelationAnnotator - Finished
> initializing
> 22 Sep 2017 13:59:51  INFO ConstituencyParser - Initializing parser...
> 22 Sep 2017 13:59:54  INFO RegexSectionizer - Annotating Sections ...
> 22 Sep 2017 13:59:55  INFO RegexSectionizer - Finished processing
> 22 Sep 2017 13:59:55  INFO SentenceDetectorAnnotatorBIO - Starting
> processing ...
> 22 Sep 2017 13:59:55  INFO SentenceDetectorAnnotatorBIO - Finished
> processing
> 22 Sep 2017 13:59:55  INFO ParagraphAnnotator - Annotating Paragraphs
> ...
> 22 Sep 2017 13:59:55  INFO ParagraphAnnotator - Finished processing
> 22 Sep 2017 13:59:55  INFO ParagraphSentenceFixer - Adjusting
> Sentences overlapping Paragraphs ...
> 22 Sep 2017 13:59:55  INFO ParagraphSentenceFixer - Finished
> Processing
> 22 Sep 2017 13:59:55  INFO ListAnnotator - Annotating Lists ...
> 22 Sep 2017 13:59:55  INFO ListAnnotator - Finished processing
> 22 Sep 2017 13:59:55  INFO ListSentenceFixer - Adjusting Sentences
> overlapping Lists ...
> 22 Sep 2017 13:59:55  INFO ListSentenceFixer - Finished Processing
> 22 Sep 2017 13:59:55  INFO TokenizerAnnotatorPTB - process(JCas) in
> org.apache.ctakes.core.ae.TokenizerAnnotatorPTB
> 22 Sep 2017 13:59:55  INFO ContextDependentTokenizerAnnotator -
> process(JCas)
> 22 Sep 2017 13:59:55  INFO POSTagger - process(JCas)
> 22 Sep 2017 13:59:55  INFO Chunker -  process(JCas)
> 22 Sep 2017 13:59:55  INFO ChunkAdjuster -  process(JCas)
> 22 Sep 2017 13:59:55  INFO ChunkAdjuster -  process(JCas)
> 22 Sep 2017 13:59:55  INFO AbstractJCasTermAnnotator - Finding Named
> Entities ...
> 22 Sep 2017 13:59:55  INFO AbstractJCasTermAnnotator - Finished
> processing
> 22 Sep 2017 13:59:55  INFO DrugMentionAnnotator - process dev (JCas)
> 22 Sep 2017 13:59:55  INFO DrugMentionAnnotator - -1
> 22 Sep 2017 13:59:55  INFO DrugMentionAnnotator - -1
> 22 Sep 2017 13:59:55  INFO DrugMentionAnnotator - -1
> 22 Sep 2017 13:59:55  INFO DrugMentionAnnotator - -1
> 22 Sep 2017 13:59:55  INFO DrugMentionAnnotator - -1
> 22 Sep 2017 13:59:55  INFO DrugMentionAnnotator - -1
> 22 Sep 2017 13:59:55  INFO DrugMentionAnnotator - -1
> 22 Sep 2017 13:59:55  INFO DrugMentionAnnotator - -1
> 22 Sep 2017 13:59:55  INFO DrugMentionAnnotator - -1
> 22 Sep 2017 13:59:55  INFO DrugMentionAnnotator - -1
> 22 Sep 2017 13:59:55  INFO DrugMentionAnnotator - -1
> 22 Sep 2017 13:59:55  INFO DrugMentionAnnotator - -1
> 22 Sep 2017 13:59:55  INFO DrugMentionAnnotator - -1
> 22 Sep 2017 13:59:55  INFO DrugMentionAnnotator - -1
> 22 Sep 2017 13:59:55  INFO DrugMentionAnnotator - -1
> 22 Sep 2017 13:59:55  INFO DrugMentionAnnotator - -1
> 22 Sep 2017 13:59:55  INFO DrugMentionAnnotator - -1
> 22 Sep 2017 13:59:56  INFO DrugMentionAnnotator - -1
> 22 Sep 2017 13:59:56  INFO DrugMentionAnnotator - -1
> 22 Sep 2017 13:59:56  INFO DrugMentionAnnotator - -1
> 22 Sep 2017 13:59:56  INFO ClearNLPDependencyParserAE - Dependency
> parser starting with thread:pool-2-thread-1
> 22 Sep 2017 13:59:56  INFO ClearNLPDependencyParserAE - Dependency
> parser ending with thread:pool-2-thread-1
> 22 Sep 2017 13:59:56  INFO ClearNLPSemanticRoleLabelerAE - Starting
> processing ...
> 22 Sep 2017 13:59:56  INFO ClearNLPSemanticRoleLabelerAE - Finished
> processing
> 22 Sep 2017 13:59:56  INFO CleartkAnalysisEngine - Assigning
> Attributes ...
> 22 Sep 2017 13:59:56  INFO CleartkAnalysisEngine - Finished Assigning
> Attributes
> 22 Sep 2017 13:59:56  INFO ModifierExtractorAnnotator - Starting
> processing ...
> 22 Sep 2017 13:59:56  INFO ModifierExtractorAnnotator - Finished
> processing
> 22 Sep 2017 13:59:56  INFO DegreeOfRelationExtractorAnnotator -
> Starting processing ...
> 22 Sep 2017 13:59:56  INFO DegreeOfRelationExtractorAnnotator -
> Finished processing
> 22 Sep 2017 13:59:56  INFO LocationOfRelationExtractorAnnotator -
> Starting processing ...
> 22 Sep 2017 13:59:57  INFO LocationOfRelationExtractorAnnotator -
> Finished processing
> 22 Sep 2017 13:59:57  INFO BackwardsTimeAnnotator - Starting
> processing ...
> 22 Sep 2017 13:59:57  INFO BackwardsTimeAnnotator - Finished
> processing
> 22 Sep 2017 13:59:57  INFO DocTimeRelAnnotator - Starting processing
> ...
> 22 Sep 2017 13:59:58  INFO DocTimeRelAnnotator - Finished processing
> 22 Sep 2017 13:59:58  INFO EventTimeRelationAnnotator - Starting
> processing ...
> 22 Sep 2017 13:59:59  INFO EventTimeRelationAnnotator - Finished
> processing
> 22 Sep 2017 13:59:59  INFO EventEventRelationAnnotator - Starting
> processing ...
> 22 Sep 2017 13:59:59  INFO EventEventRelationAnnotator - Finished
> processing
> 22 Sep 2017 13:59:59  INFO MaxentParserWrapper - Started processing:
> test
> 22 Sep 2017 14:00:02  INFO MaxentParserWrapper - Done parsing: test
> 22 Sep 2017 14:00:03  INFO MentionClusterCoreferenceAnnotator -
> Finding Coreferences ...
> 22 Sep 2017 14:00:03  INFO MentionClusterCoreferenceAnnotator -
> Finished.
> 22 Sep 2017 14:00:03  INFO HtmlTextWriter - Writing HTML to
> D:\Gandhi\ArisG\cTAKES\apache-ctakes-
> 4.0.0\bin_old\test_output\test.txt.pretty.html ...
> 22 Sep 2017 14:00:03  INFO HtmlTextWriter - Finished Writing
> 22 Sep 2017 14:00:03  INFO FileTreeXmiWriter - Writing XMI to
> D:\Gandhi\ArisG\cTAKES\apache-ctakes-
> 4.0.0\bin_old\test_output\test.txt.xmi ...
> Sep 22, 2017 2:00:03 PM org.apache.uima.util.MessageReport
> decreasingWithTrace(51)
> WARNING: Message count: 1; Feature
> org.apache.ctakes.typesystem.type.textsem.Predicate:relations is
> marked multipleReferencesAllowed=false, but it has multiple
> references.  These will be serialized in duplicate. Message count
> indicates messages skipped to avoid potential flooding. Set FINE
> logging level for stacktrace.
> Sep 22, 2017 2:00:03 PM org.apache.uima.util.MessageReport
> decreasingWithTrace(51)
> WARNING: Message count: 2; Feature
> org.apache.ctakes.typesystem.type.textsem.Predicate:relations is
> marked multipleReferencesAllowed=false, but it has multiple
> references.  These will be serialized in duplicate. Message count
> indicates messages skipped to avoid potential flooding. Set FINE
> logging level for stacktrace.
> Sep 22, 2017 2:00:03 PM org.apache.uima.util.MessageReport
> decreasingWithTrace(51)
> WARNING: Message count: 4; Feature
> org.apache.ctakes.typesystem.type.textsem.Predicate:relations is
> marked multipleReferencesAllowed=false, but it has multiple
> references.  These will be serialized in duplicate. Message count
> indicates messages skipped to avoid potential flooding. Set FINE
> logging level for stacktrace.
> Sep 22, 2017 2:00:03 PM org.apache.uima.util.MessageReport
> decreasingWithTrace(51)
> WARNING: Message count: 8; Feature
> org.apache.ctakes.typesystem.type.textsem.Predicate:relations is
> marked multipleReferencesAllowed=false, but it has multiple
> references.  These will be serialized in duplicate. Message count
> indicates messages skipped to avoid potential flooding. Set FINE
> logging level for stacktrace.
> Sep 22, 2017 2:00:03 PM org.apache.uima.util.MessageReport
> decreasingWithTrace(51)
> WARNING: Message count: 16; Feature
> org.apache.ctakes.typesystem.type.textsem.Predicate:relations is
> marked multipleReferencesAllowed=false, but it has multiple
> references.  These will be serialized in duplicate. Message count
> indicates messages skipped to avoid potential flooding. Set FINE
> logging level for stacktrace.
> Sep 22, 2017 2:00:03 PM org.apache.uima.util.MessageReport
> decreasingWithTrace(51)
> WARNING: Message count: 32; Feature
> org.apache.ctakes.typesystem.type.textsem.Predicate:relations is
> marked multipleReferencesAllowed=false, but it has multiple
> references.  These will be serialized in duplicate. Message count
> indicates messages skipped to avoid potential flooding. Set FINE
> logging level for stacktrace.
> 22 Sep 2017 14:00:03  INFO FileTreeXmiWriter - Finished Writing
>
>
> Sean, we have also made some code changes in MeasurementFSM.java to
> identify certain measurements like '20 mg/m2', which were not
> identified out of the box. Should we send the code changes to you so
> that you can consider including them in the product? Please advise.
>
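
As a rough illustration of the kind of measurement discussed above, here is a minimal sketch that matches doses such as "200mg" and "20 mg / m2" with a plain regular expression. It is not the MeasurementFSM code and not the change that was proposed; the class name DoseMeasurementFinder, the method findMeasurements and the short unit lists are all assumptions made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Hypothetical helper, NOT part of cTAKES and not the MeasurementFSM logic.
 * Finds dose-like measurements such as "200mg" or "20 mg / m2" with a regex.
 */
final class DoseMeasurementFinder {

   // Integer or decimal number, e.g. 20 or 1.5
   static private final String NUMBER = "\\d+(?:\\.\\d+)?";
   // Assumed, deliberately short list of primary units.
   static private final String UNIT = "(?:mg|mcg|g|kg|ml|l)";
   // Assumed, deliberately short list of "per" units, e.g. the m2 in mg/m2.
   static private final String PER_UNIT = "(?:m2|kg|ml|l|day)";

   // number, optional space, unit, then optionally "/ per-unit" with optional spaces.
   static private final Pattern MEASUREMENT = Pattern.compile(
         "\\b" + NUMBER + "\\s*" + UNIT + "(?:\\s*/\\s*" + PER_UNIT + ")?\\b",
         Pattern.CASE_INSENSITIVE );

   private DoseMeasurementFinder() {
   }

   /**
    * @param text any text
    * @return every matched measurement, in order of appearance
    */
   static List<String> findMeasurements( final String text ) {
      final List<String> measurements = new ArrayList<>();
      final Matcher matcher = MEASUREMENT.matcher( text );
      while ( matcher.find() ) {
         measurements.add( matcher.group() );
      }
      return measurements;
   }

   public static void main( final String... args ) {
      System.out.println( findMeasurements(
            "Thalomid 200mg ( days 1 - 21 ) , and Epirubicin ,20 mg / m2 ( days 1 , 8 , and 15 )" ) );
   }
}
```

On the example sentence from this thread the sketch should pick up "200mg" and "20 mg / m2" and ignore the day numbers. A real fix inside cTAKES would presumably extend the existing finite state machine rather than add a separate regex pass.
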
> Regards,
> Gandhi
>
>
> -----Original Message-----
> From: Finan, Sean [mailto:Sean.Finan@childrens.harvard.edu]
> Sent: Friday, September 22, 2017 6:54 PM
> To: dev@ctakes.apache.org
> Subject: RE: Enabling drugner pipeline and identifying dates
> [EXTERNAL] [SUSPICIOUS]
>
> Hi Gandhi,
>
> You don't need to add BackwardsTimeAnnotator to your piper.  It is
> added by the TemporalSubPipe.piper.  The  error that you are seeing
> regarding training is very strange, but you can try adding this line
> to the top of the file:
> set isTraining=false
>
> Can you run a sample file with your piper and send me the log
> statements?  It might help me figure out what is going on.
>
> >
> > is there any doc or guide on how to start writing our own annotator.
> There are two example annotators in the ctakes-examples project under
> the ae/ directory.  You can look at those, but I recommend that you
> look at some information on Uimafit, which can be used to create new
> annotators:
> https://urldefense.proofpoint.com/v2/url?u=https-3A__uima.apache.org_
> d_uimafit-
> 2D2.1.0_tools.uimafit.book.pdf&d=DwIFAg&c=qS4goWBT7poplM69zy_3xhKwEW1
> 4JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=OlZ5
> SUTgU94HjHE8vZDkXv8hjaaa9qEpAlfZjU52Ymk&s=0rIPMY5osSxL4J9gMymmv0bHsBX
> imd0yb1FmUp4uT-A&e=
> An introduction to creating Analysis Engines (Annotators) is on page
> 5.
>
> Coding style is individualistic, but below is a rubberstamp that I use
> to get started:
>
> import org.apache.ctakes.core.pipeline.PipeBitInfo;
> import org.apache.log4j.Logger;
> import org.apache.uima.UimaContext;
> import org.apache.uima.analysis_engine.AnalysisEngineProcessException;
> import org.apache.uima.fit.component.JCasAnnotator_ImplBase;
> import org.apache.uima.jcas.JCas;
> import org.apache.uima.resource.ResourceInitializationException;
>
> /**
>  * @author SPF , chip-nlp
>  * @version %I%
>  * @since 9/22/2017
>  */
> @PipeBitInfo(
>       name = "Template",
>       description = "For Example.",
>       role = PipeBitInfo.Role.ANNOTATOR
> )
> final public class Template extends JCasAnnotator_ImplBase {
>
>    static private final Logger LOGGER = Logger.getLogger( "Template" );
>
>    /**
>     * {@inheritDoc}
>     */
>    @Override
>    public void initialize( final UimaContext context ) throws ResourceInitializationException {
>       // Always call the super first
>       super.initialize( context );
>       // place AE initialization code here
>    }
>
>    /**
>     * {@inheritDoc}
>     */
>    @Override
>    public void process( final JCas jCas ) throws AnalysisEngineProcessException {
>       LOGGER.info( "Processing ..." );
>       // Place AE processing code here
>       LOGGER.info( "Finished." );
>    }
> }
>
>
>
> If you use IntelliJ as your ide you can create a file template with
> these parameters:
>
> #if (${PACKAGE_NAME} && ${PACKAGE_NAME} != "")package ${PACKAGE_NAME};#end
>
> import org.apache.ctakes.core.pipeline.PipeBitInfo;
> import org.apache.log4j.Logger;
> import org.apache.uima.UimaContext;
> import org.apache.uima.analysis_engine.AnalysisEngineProcessException;
> import org.apache.uima.fit.component.JCasAnnotator_ImplBase;
> import org.apache.uima.jcas.JCas;
> import org.apache.uima.resource.ResourceInitializationException;
>
> #parse("File Header.java")
> @PipeBitInfo(
>       name = "${NAME}",
>       #if ( ${PROJECT_NAME} != "")description = "For ${PROJECT_NAME}.",#end
>       role = PipeBitInfo.Role.ANNOTATOR
> )
> final public class ${NAME} extends JCasAnnotator_ImplBase {
>
>    static private final Logger LOGGER = Logger.getLogger( "${NAME}" );
>
>    /**
>     * {@inheritDoc}
>     */
>    @Override
>    public void initialize( final UimaContext context ) throws ResourceInitializationException {
>       // Always call the super first
>       super.initialize( context );
>       // place AE initialization code here
>    }
>
>    /**
>     * {@inheritDoc}
>     */
>    @Override
>    public void process( final JCas jCas ) throws AnalysisEngineProcessException {
>       LOGGER.info( "Processing ..." );
>       // Place AE processing code here
>       LOGGER.info( "Finished." );
>    }
> }
>
>
>
>
>
> -----Original Message-----
> From: Gandhi Rajan Natarajan [mailto:Gandhi.Natarajan@arisglobal.com]
> Sent: Friday, September 22, 2017 2:23 AM
> To: dev@ctakes.apache.org
> Subject: RE: Enabling drugner pipeline and identifying dates
> [EXTERNAL] [SUSPICIOUS]
>
> Hi Sean,
>
> Thanks again for the detailed response.
>
> I still couldn't manage to get superscript-1 co-reference in piper
> GUI.  Also I'm not able to use "BackwardsTimeAnnotator" in piper GUI
> as it gives me the below error:
>
> org.apache.uima.resource.ResourceInitializationException:
> Initialization of annotator class
> "org.apache.ctakes.temporal.ae.BackwardsTimeAnnotator"
> failed.  (Descriptor: <unknown>)
>         at
> org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.ini
> tializeAnalysisComponent(PrimitiveAnalysisEngine_impl.java:271)
>         at
> org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.ini
> tialize(PrimitiveAnalysisEngine_impl.java:170)
> Caused by: java.lang.IllegalArgumentException: Please specify
> PARAM_IS_TRAINING - unable to infer it from context
>         at
> org.cleartk.ml.CleartkAnnotator.initialize(CleartkAnnotator.java:109)
>
> Somewhere in old mails it's mentioned that it's because of missing
> dependencies so I tried adding ClearTkAnnotator with no luck yet. My
> piper file is as follows:
>
> load AdvancedTokenizerPipeline.piper
> add ContextDependentTokenizerAnnotator
> add POSTagger
> load ChunkerSubPipe.piper
> load DictionarySubPipe.piper
> add org.apache.ctakes.drugner.ae.DrugMentionAnnotator
> load AttributeCleartkSubPipe.piper
> load RelationSubPipe.piper
> load TemporalSubPipe.piper
> load CorefSubPipe.piper
> add org.apache.ctakes.temporal.ae.BackwardsTimeAnnotator
> add pretty.html.HtmlTextWriter
> add FileTreeXmiWriter
>
> Any suggestion on this? Also I'm using all the latest 4.0.1 cTAKES
> Jars. Regarding the identification of Names, will dig deep on what you
> have mentioned.
>
> Sorry to ask this, as you already mentioned that there are no detailed
> docs for cTAKES. But is there any doc or guide on how to start writing
> our own annotator if required? If not, is there any simple annotator
> that you would suggest we look into to get a better understanding of
> annotators before we proceed further?  Thanks in advance.
>
> Regards,
> Gandhi
>
>
> -----Original Message-----
> From: Finan, Sean [mailto:Sean.Finan@childrens.harvard.edu]
> Sent: Thursday, September 21, 2017 7:59 AM
> To: dev@ctakes.apache.org
> Subject: RE: Enabling drugner pipeline and identifying dates
> [EXTERNAL] [SUSPICIOUS]
>
> Hi Gandhi,
>
> >
> > We guess we are missing out on something as we could not find co-
> > references for "200mg". Should we add anymore piper for this?
> The piper commands that I sent have everything to obtain coreferences.
> I use it regularly - it is what I used on your example sentence to get
> the coreferences that I mentioned.
>
> >
> > Also the change mentioned in the thread ...
> That is a very old thread and I don't think that it applies to what
> you are trying to do.
>
> >
> > We also have a requirement to identify the patient names and sex
> As James said, ctakes isn't really meant to do this.  Ctakes is
> catered toward extracting clinical data, and to this point names have
> not fallen into that category.  It is more a task for general nlp.
> There is an opennlp model that can identify names and a few others (I
> used to see names using GATE).  ctakes has wrapped opennlp for other
> tasks and you should be able to do the same to adapt an engine for
> names into ctakes.
>
> >
> > cTAKES is unable to identify the dates like 20Aug02 or 20/Aug/02 or
> > 06
> > / 07 / 02 or 27Aug2002
> As Chen mentioned, the BackwardsTimeAnnotator module uses an ML model
> trained on gold data.  It isn't perfect.  You can add another time
> annotator on top of this to get some of the more simply formatted date
> mentions - there are a lot of them out there.  Personally I have used
> jchronic as it can be easily tweaked to recognize medically relevant
> temporal expressions relating to surgery, pharmacology, etc.
>
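
To make the idea of catching the "more simply formatted" dates concrete, here is a minimal sketch using plain regular expressions. It covers only the four shapes mentioned in this thread (20Aug02, 20/Aug/02, 06 / 07 / 02, 27Aug2002). It is not how BackwardsTimeAnnotator or jchronic work, and the class name SimpleDateFinder and method findDates are hypothetical names chosen for this example. To use it in a pipeline, the matching would still need to be wrapped in an annotator that creates time annotations from the match offsets.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Hypothetical helper, NOT part of cTAKES.
 * Finds a few simply formatted date mentions with regular expressions.
 * It does no validation, so "99/99/99" would also be accepted.
 */
final class SimpleDateFinder {

   // Three-letter English month abbreviations.
   static private final String MONTH = "(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)";
   // Four-digit or two-digit year.
   static private final String YEAR = "(?:\\d{4}|\\d{2})";

   // 20Aug02, 27Aug2002, 20/Aug/02: day, month name, year, optional slashes and spaces.
   static private final String DAY_MONTH_YEAR
         = "\\b\\d{1,2}\\s*/?\\s*" + MONTH + "\\s*/?\\s*" + YEAR + "\\b";
   // 06 / 07 / 02 or 06/07/2002: three numeric fields separated by slashes.
   static private final String NUMERIC
         = "\\b\\d{1,2}\\s*/\\s*\\d{1,2}\\s*/\\s*" + YEAR + "\\b";

   static private final Pattern DATE = Pattern.compile(
         "(?:" + DAY_MONTH_YEAR + ")|(?:" + NUMERIC + ")",
         Pattern.CASE_INSENSITIVE );

   private SimpleDateFinder() {
   }

   /**
    * @param text any text
    * @return every matched date mention, in order of appearance
    */
   static List<String> findDates( final String text ) {
      final List<String> dates = new ArrayList<>();
      final Matcher matcher = DATE.matcher( text );
      while ( matcher.find() ) {
         dates.add( matcher.group() );
      }
      return dates;
   }

   public static void main( final String... args ) {
      System.out.println( findDates( "On 20Aug02, the investigator noted that this patient was suffering" ) );
      System.out.println( findDates( "Epirubicin ,20 mg / m2 ( days 1 , 8 , and 15 ) on 06 / 07 / 02" ) );
   }
}
```

The pattern is intentionally narrow so that dose text such as "20 mg / m2" is not mistaken for a date. Broader coverage (written-out months, relative expressions) is where a dedicated library or a trained model is the better tool.
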
> Sean
>
>
> -----Original Message-----
> From: Finan, Sean [mailto:Sean.Finan@childrens.harvard.edu]
> Sent: Wednesday, September 20, 2017 8:50 AM
> To: dev@ctakes.apache.org
> Subject: RE: Enabling drugner pipeline and identifying dates
> [EXTERNAL] [SUSPICIOUS]
>
> Hi Gandhi,
>
> I don't have time to go through all of this right now, but I will try
> to get to it soon.
>
> Make sure that you are running the latest version in trunk.
>
> Sean
>
> -----Original Message-----
> From: Gandhi Rajan Natarajan [mailto:Gandhi.Natarajan@arisglobal.com]
> Sent: Wednesday, September 20, 2017 7:03 AM
> To: dev@ctakes.apache.org
> Subject: RE: Enabling drugner pipeline and identifying dates
> [EXTERNAL]
>
> Hi, Could someone help me out on the below queries please?
>
> Regards,
> Gandhi
>
> -----Original Message-----
> From: Gandhi Rajan Natarajan [mailto:Gandhi.Natarajan@arisglobal.com]
> Sent: Tuesday, September 19, 2017 8:51 PM
> To: dev@ctakes.apache.org
> Subject: RE: Enabling drugner pipeline and identifying dates
> [EXTERNAL]
>
> Hi Sean,
>
> Thanks again for the detailed and prompt response. We were able to run
> the piper GUI as per your advice. But in the output (The patient
> started study treatment of Thalomid 200mg ( days 1 - 21 ) , and
> Epirubicin ,20 mg / m2 ( days 1 , 8 , and 15 ) on 06 / 07 / 02 for the
> treatment of hepatocellular carcinoma.), we were not able to find
> superscript-1 as you mentioned earlier but could find superscript-2,
> 3 etc.  We guess we are missing something, as we could not find
> co-references for "200mg". Should we add any more piper commands for this?
>
> Also the change mentioned in the thread - https://urldefense.proofpoi
> nt.com/v2/url?u=http-3A__mail-2Darchives.apache.org_mod-
> 5Fmbox_ctakes-2Duser_201403.mbox_-253CCAL6WimrJ-5Fmm1-
> 2BXyggBZv62diYuWP0ScA9VEV8mNHGWe4hSNHQg-40mail.gmail.com-
> 253E&d=DwIFAg&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67Gvl
> GZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=JoUDRZHu91gGMslwknPzTQC_UG2LEB
> LyOfXR3ikwOL0&s=GzhvIkBu4cgyzYN9n6VLe2rz4sJhJzMxDcWyB0BkqAc&e=  is
> required for the drug-ner module to identify drug-ner annotations.
>
> 1) We also have a requirement to identify the patient names and sex
> available in narrative texts. Please let us know how to achieve this,
> as it's not identifying the proper nouns and their relationship with
> the patient.
> Eg. "This male patient named Tom Hardy aged 35 years is participating
> in a Non-IND study"
>
> 2) cTAKES is unable to identify the dates like 20Aug02 or 20/Aug/02 or
> 06 / 07 / 02 or 27Aug2002 as in the below example. Please let us know
> how to enhance the system to identify such date patterns.
> E.g " On 20Aug02, the investigator noted that this patient was
> suffering worsening fatigue and got tired getting out of his chair"
>
> Regards,
> Gandhi
>
>
> -----Original Message-----
> From: Finan, Sean [mailto:Sean.Finan@childrens.harvard.edu]
> Sent: Monday, September 18, 2017 10:02 PM
> To: dev@ctakes.apache.org
> Subject: RE: Enabling drugner pipeline and identifying dates
> [EXTERNAL]
>
> Hi Gandhi,
>
> >
> > So in this case will we be able to see drug attributes in the output
> > XML?
> As long as you have the DrugMentionAnnotator in your pipeline you
> should be able to find drug attributes in the xml output file.
>
> >
> > we also saw some code changes needs to be done to use drug-ner
> > module. Is it still valid?
> As far as I know there aren't any necessary code changes to get drug
> ner running.  However, I do not normally use drugner so I can't say
> for certain.
>
> >
> > Also you mentioned that the drug-ner module is out of date
> It can still be used and will produce annotations.  All that I meant
> was that there may not be many people out there using it.  It is not
> part of the default pipeline.
>
> >
> > You also mentioned that when you run the sentence, the date was
> > identified. Where and how exactly did you run it so that we can check
> > the same?
> I run the following in a piper file because I am interested in a lot
> of modules (I added drugner just for you):
>
> // Advanced Tokenization: Regex sectionization, BIO Sentence Detector (lumper), Paragraphs, Lists
> load AdvancedTokenizerPipeline.piper
> add ContextDependentTokenizerAnnotator
> add POSTagger
> // Chunkers
> load ChunkerSubPipe.piper
> // Default fast dictionary lookup
> load DictionarySubPipe.piper
> add org.apache.ctakes.drugner.ae.DrugMentionAnnotator
> // Cleartk Entity Attributes
> load AttributeCleartkSubPipe.piper
> // Relations
> load RelationSubPipe.piper
> // Temporal
> load TemporalSubPipe.piper
> // Coreferences
> load CorefSubPipe.piper
> // Html output
> add pretty.html.HtmlTextWriter
>
> For information on piper files, see https://urldefense.proofpoint.com
> /v2/url?u=https-3A__cwiki.apache.org_confluence_display_CTAKES_Piper-
> 2BFiles&d=DwIFAg&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67
> GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=JoUDRZHu91gGMslwknPzTQC_UG2
> LEBLyOfXR3ikwOL0&s=9ueuHYwEywok8byBXEkVjmTWiChmaIY3ryB4Pi6ajRo&e=
> I run it in my IDE with:
> org.apache.ctakes.core.pipeline.PiperFileRunner -Xmx3G -p
> <FileAsAbove>.piper -i org/apache/ctakes/examples/notes -o <OutputDir>
> --user <MyUmlsUser> --pass <MyUmlsPass>
> You can run it by command line by substituting
> "org.apache.ctakes.core.pipeline.PiperFileRunner -Xmx3G" with
> "bin/runPiperFile".
> You can also run it through a ctakes 4.0.1 (trunk) gui.  See https://u
> rldefense.proofpoint.com/v2/url?u=https-
> 3A__cwiki.apache.org_confluence_display_CTAKES_Piper-2BFile-
> 2BSubmitter-
> 2BGUI&d=DwIFAg&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67Gv
> lGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=JoUDRZHu91gGMslwknPzTQC_UG2LE
> BLyOfXR3ikwOL0&s=VWIrXrfA2dZ8KHOdoizJo-nTx7nPSy4GDOZ7IxQteIQ&e=
>
> >
> > I'm not able to see any clickable option in HTML output
> You must have the HtmlTextWriter at the end of your pipeline to
> produce html files.  To keep the xml file output, place "add
> FileTreeXmiWriter" at the end of the piper.
>
> >
> > Apologies for too many
> No worries, we are happy to have your interest!
>
> Sean
>
>
> -----Original Message-----
> From: Gandhi Rajan Natarajan [mailto:Gandhi.Natarajan@arisglobal.com]
> Sent: Saturday, September 16, 2017 7:01 AM
> To: dev@ctakes.apache.org
> Subject: RE: Enabling drugner pipeline and identifying dates
> [EXTERNAL]
>
> Hi Sean,
>
> Thanks again for the prompt response. Appreciate your input on adding
> DrugMentionAnnotator. Actually, we are relying on pretty printer
> output just to understand the analysis. Our logic to extract disorders
> and findings is based on the XML file generated by https:/
> /urldefense.proofpoint.com/v2/url?u=https-
> 3A__github.com_healthnlp_examples_blob_master_ctakes-2Dtemporal-
> 2Ddemo_src_main_java_org_apache_ctakes_web_client_servlet_DemoServlet
> .java&d=DwIFAg&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67Gv
> lGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=_MJKBj93YJdd5aa84dBvqtg6o-
> BKBn7UcbfF660CEBI&s=g8UzBHRoOyn1hoRABKSC6EtPMvwOSSggviRmWCHKti4&e=
> So in this case will we be able to see drug attributes in the output XML?
>
> In one of the old posts (https://urldefense.proofpoint.com/v2/url?u=ht
> tp-3A__mail-2Darchives.apache.org_mod-5Fmbox_ctakes-
> 2Duser_201403.mbox_-253CCAL6WimrJ-5Fmm1-
> 2BXyggBZv62diYuWP0ScA9VEV8mNHGWe4hSNHQg-40mail.gmail.com-
> 253E&d=DwIFAg&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67Gvl
> GZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=_MJKBj93YJdd5aa84dBvqtg6o-
> BKBn7UcbfF660CEBI&s=iT_1UGR98APO80UaZsaCBHseMqF4M4PfItgokD27r5c&e=  )
> we also saw that some code changes need to be made to use the drug-ner
> module. Is that still valid? Also, you mentioned that the drug-ner
> module is out of date; does that mean it cannot be used, or that it may
> not provide accurate analysis? Also, what changes need to be made to
> bring it up to date, so that we can try it with your assistance?
>
> You also mentioned that when you run the sentence, the date was
> identified. Where and how exactly did you run it so that we can check
> the same? Also, regarding your explanation of coreference, I'm not able
> to see any clickable option in the HTML output. So I wanted to
> understand how we can run and check that too.
>
> Apologies for so many questions, as we are just a week old in NLP and
> cTAKES. Thanks in advance.
>
> Regards,
> Gandhi
>
> This email and any files transmitted with it are confidential and
> intended solely for the use of the individual or entity to whom they
> are addressed. If you are not the named addressee you should not
> disseminate, distribute or copy this e-mail. Please notify the sender
> or system manager by email immediately if you have received this e-
> mail by mistake and delete this e-mail from your system. If you are
> not the intended recipient you are notified that disclosing, copying,
> distributing or taking any action in reliance on the contents of this
> information is strictly prohibited and against the law.