Posted to user@ctakes.apache.org by "Chen, Pei" <Pe...@childrens.harvard.edu> on 2012/11/30 19:55:20 UTC

RE: DrugNER & PAD Term Spotter (was: Regarding Assertion Tagger)

Hi Sean,
It should load the type system by name instead of path now.
The XML file should use:
<import name="org.apache.ctakes.typesystem.types.TypeSystem"/>
<import name="org.apache.ctakes.drugner.types.TypeSystem"/>

Instead of:
<import location="../type_system/NERTypeSystem.xml"/>
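
[Editor's sketch] To illustrate what importing "by name" means here: UIMA turns the dotted name into a resource path (dots become slashes, ".xml" is appended) and looks it up on the classpath or datapath, so the descriptor no longer depends on its own directory. The class and method names below are hypothetical and only mimic that lookup with the plain JDK; they are not part of cTAKES or UIMA.

```java
// Hypothetical helper, not part of cTAKES or UIMA: mimics how an
// <import name="..."/> is mapped to a classpath resource.
class ImportNameSketch {

    // Dots become slashes and ".xml" is appended.
    static String toResourcePath(String importName) {
        return importName.replace('.', '/') + ".xml";
    }

    // True if the resource for this import name is visible to the class loader.
    static boolean isOnClasspath(String importName) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        return loader.getResource(toResourcePath(importName)) != null;
    }

    public static void main(String[] args) {
        String[] names = {
            "org.apache.ctakes.typesystem.types.TypeSystem",
            "org.apache.ctakes.drugner.types.TypeSystem"
        };
        for (String name : names) {
            String status = isOnClasspath(name) ? "found" : "not on classpath";
            System.out.println(name + " -> " + toResourcePath(name) + " (" + status + ")");
        }
    }
}
```

Running this with the cTAKES jars on the classpath would be one quick way to check that both type systems are reachable before starting a pipeline.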

While testing this, I also noticed that the chunker classname in DrugAggregatePlaintextProcessor.xml should be:
org.apache.ctakes.chunker.ae.PhraseTypeChunkCreator
instead of:
edu.mayo.bmi.uima.chunker.PhraseTypeChunkCreator
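
[Editor's sketch] A stale annotator class name like the one above only fails when the engine is initialized. A minimal check, assuming nothing beyond the JDK, is to ask the class loader whether each name can be loaded. The class AnnotatorClassCheck and its method are hypothetical names used for illustration.

```java
// Hypothetical helper: reports whether a class name from a descriptor
// can be loaded with the current classpath.
class AnnotatorClassCheck {

    static boolean isLoadable(String className) {
        try {
            // 'false' avoids running static initializers; we only want to know it exists.
            Class.forName(className, false, AnnotatorClassCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] names = {
            "org.apache.ctakes.chunker.ae.PhraseTypeChunkCreator",
            "edu.mayo.bmi.uima.chunker.PhraseTypeChunkCreator"
        };
        for (String name : names) {
            System.out.println(name + ": " + (isLoadable(name) ? "loadable" : "NOT loadable"));
        }
    }
}
```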

--Pei

From: Murphy, Sean P. [mailto:Murphy.Sean@mayo.edu]
Sent: Friday, November 30, 2012 1:13 PM
To: 'Deepal Dhariwal'; 'ctakes-dev@incubator.apache.org'
Cc: '<ct...@incubator.apache.org>'
Subject: RE: Regarding Assertion Tagger

I'm able to reproduce the problem on 3.0 in both the prebuilt zip binary and the Maven-built environment.  The underlying issue appears to be that the paths are not being handled correctly.   I will investigate further.

Caused by: org.apache.uima.util.InvalidXMLException: Import failed.  Could not read from URL file:/C:/tools/cTAKESv3.0_bin/desc/ctakes-drug-ner/desc/type_system/NERTypeSystem.xml. (Descriptor: file:/C:/tools/cTAKESv3.0_bin/desc/ctakes-drug-ner/desc/analysis_engine/DrugCNP2LookupWindow.xml)
                at org.apache.uima.resource.metadata.impl.TypeSystemDescription_impl.resolveImports(TypeSystemDescription_impl.java:231)
                at org.apache.uima.resource.metadata.impl.TypeSystemDescription_impl.resolveImports(TypeSystemDescription_impl.java:207)
                at org.apache.uima.analysis_engine.metadata.impl.AnalysisEngineMetaData_impl.resolveImports(AnalysisEngineMetaData_impl.java:87)
                at org.apache.uima.analysis_engine.impl.AnalysisEngineDescription_impl.resolveImports(AnalysisEngineDescription_impl.java:741)
                at org.apache.uima.analysis_engine.impl.AnalysisEngineDescription_impl.resolveDelegateAnalysisEngineImports(AnalysisEngineDescription_impl.java:827)
                at org.apache.uima.analysis_engine.impl.AnalysisEngineDescription_impl.resolveImports(AnalysisEngineDescription_impl.java:733)
                at org.apache.uima.analysis_engine.impl.AnalysisEngineDescription_impl.resolveDelegateAnalysisEngineImports(AnalysisEngineDescription_impl.java:827)
                at org.apache.uima.analysis_engine.impl.AnalysisEngineDescription_impl.resolveDelegateAnalysisEngineImports(AnalysisEngineDescription_impl.java:765)
                at org.apache.uima.analysis_engine.impl.AnalysisEngineDescription_impl.getDelegateAnalysisEngineSpecifiers(AnalysisEngineDescription_impl.java:193)
                at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initialize(AggregateAnalysisEngine_impl.java:147)
                ... 45 more
Caused by: java.io.FileNotFoundException: C:\tools\cTAKESv3.0_bin\desc\ctakes-drug-ner\desc\type_system\NERTypeSystem.xml (The system cannot find the path specified)
                at java.io.FileInputStream.open(Native Method)
                at java.io.FileInputStream.<init>(Unknown Source)
                at java.io.FileInputStream.<init>(Unknown Source)
                at sun.net.www.protocol.file.FileURLConnection.connect(Unknown Source)
                at sun.net.www.protocol.file.FileURLConnection.getInputStream(Unknown Source)
                at org.apache.uima.util.XMLInputSource.<init>(XMLInputSource.java:120)
                at org.apache.uima.resource.metadata.impl.TypeSystemDescription_impl.resolveImport(TypeSystemDescription_impl.java:263)
                at org.apache.uima.resource.metadata.impl.TypeSystemDescription_impl.resolveImports(TypeSystemDescription_impl.java:229)
                ... 54 more
                Thanks,
                                ~Sean
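
[Editor's sketch] The failing path in the trace is what you get when the relative import location "../type_system/NERTypeSystem.xml" is resolved against the descriptor's own URL. The snippet below reproduces that resolution with java.net.URI only; the class name is hypothetical, and the two strings are copied from the messages above.

```java
import java.net.URI;

// Hypothetical illustration of how a relative <import location="..."/>
// is resolved against the URL of the descriptor that contains it.
class RelativeImportSketch {

    static URI resolveImport(String descriptorUrl, String location) {
        return URI.create(descriptorUrl).resolve(location);
    }

    public static void main(String[] args) {
        String descriptor =
            "file:/C:/tools/cTAKESv3.0_bin/desc/ctakes-drug-ner/desc/analysis_engine/DrugCNP2LookupWindow.xml";
        String location = "../type_system/NERTypeSystem.xml";
        // The result is the same URL reported in the InvalidXMLException above.
        System.out.println(resolveImport(descriptor, location));
    }
}
```

Because the result depends on where the descriptor sits on disk, moving descriptors between directory layouts breaks such imports, which is consistent with the FileNotFoundException in the trace.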

From: Murphy, Sean P.
Sent: Friday, November 30, 2012 8:58 AM
To: 'Deepal Dhariwal'; ctakes-dev@incubator.apache.org
Cc: <ct...@incubator.apache.org>
Subject: RE: Regarding Assertion Tagger

Deepal,
               Thanks for verifying #1.   It took a bit longer than expected to set up my environment on a test machine to verify, but please bear with me while I run through a regimen of tests on these pipelines.

#2. You may need to increase the heap size set by the VM arguments for your Java environment.  I believe the default is "-Xms1024M -Xmx2048M".   If your system has the resources, you may want to increase these by 1 GB and retest.  I do not believe this will improve the run time, however.    Perhaps someone else has suggestions on that aspect?
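
[Editor's sketch] After changing -Xmx it is easy to edit the wrong launch script, so it can help to confirm the limit the JVM actually received. The class below is a hypothetical, JDK-only check; run it with the same VM arguments as the pipeline.

```java
// Hypothetical check: prints the maximum heap the running JVM will use,
// which reflects the -Xmx value it was started with.
class HeapCheck {

    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap available to this JVM: " + maxHeapMegabytes() + " MB");
    }
}
```

For example, "java -Xms1024M -Xmx3072M HeapCheck" should report a value close to 3072 MB; the exact figure varies by JVM.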

#3.  I will have to defer this question to the rest of the team.


From: Deepal Dhariwal [mailto:deepaldhariwal@gmail.com]
Sent: Thursday, November 29, 2012 8:47 PM
To: ctakes-dev@incubator.apache.org
Cc: Murphy, Sean P.; <ct...@incubator.apache.org>
Subject: Re: Regarding Assertion Tagger

Hello all,

1. I downloaded cTAKES 3.0 and was trying the PAD Term Spotter and the Drug NER lookup annotator, but I am getting a ResourceInitializationException. I have a valid UMLS license and I have added the username and password in the UMLS Lookup Annotator.   I have been following the thread on the PAD Term Spotter bug in cTAKES 2.5 and I wanted to know whether it has been resolved in cTAKES 3.0.
2. Further, my input data set is 4 MB. When I run the Collection Processing Engine on the data set I get a java.lang.OutOfMemoryError (Java heap space). Is there any way I could resolve this error and also reduce the time taken to execute on such a large data set?
3. Lastly, as part of my thesis I am working on extracting cardiovascular terms from medical text using cTAKES and the UMLS vocabulary. I want to map these terms to existing medical OWL ontologies, for example the UMLS Semantic Network. I wanted to know whether the cTAKES community is thinking of including a medical ontology feature in cTAKES.

Thanks
Deepal Dhariwal


On Mon, Nov 26, 2012 at 12:25 PM, Chen, Pei <Pe...@childrens.harvard.edu> wrote:
Thanks Sean,
If the issue was just a descriptor path issue, then it was probably already fixed in 3.0 as part of the ASF move.
Can we just verify and test it?

--Pei

> -----Original Message-----
> From: Murphy, Sean P. [mailto:Murphy.Sean@mayo.edu]
> Sent: Monday, November 26, 2012 12:21 PM
> To: Chen, Pei
> Cc: ctakes-dev@incubator.apache.org; <ct...@incubator.apache.org>
> Subject: RE: Regarding Assertion Tagger
>
> Hello Pei,
>       I have created a bug for the 3.0 branch as well.   However, since the
> problem is related to the relative path structures being incorrectly migrated
> to the updated format,  I am not sure if the fix should be made to the
> SourceForge 2.5 version only or to all releases.    The Maven-driven build changes
> appear to be consolidating some of these issues, but won't be in place until
> the 3.0 build has been finalized.    If so, and please correct me if I'm wrong, then:
>       1) There is no need to fix at 2.6, and
>       2) The fix checked in at 3.0 would be based on the old directory
> structure.
>       Thanks,
>               ~Sean
>
> -----Original Message-----
> From: ctakes-user-return-31-Murphy.Sean=mayo.edu@incubator.apache.org
> [mailto:ctakes-user-return-31-Murphy.Sean=mayo.edu@incubator.apache.org] On Behalf Of Chen, Pei
> Sent: Thursday, November 15, 2012 3:45 PM
> To: <ct...@incubator.apache.org>
> Cc: <ct...@incubator.apache.org>; ctakes-dev@incubator.apache.org
> Subject: Re: Regarding Assertion Tagger
>
> There's a 3.0.0 branch.  The release will be made from there.  So we should
> make the fixes in both trunk and 3.0.0.
>
>
> On Nov 15, 2012, at 10:32 PM, "Murphy, Sean P." <Mu...@mayo.edu> wrote:
>
> > Hello Pei,
> >    The issue is at 2.5.   When is the 3.0 release freeze?   I will try to propagate
> > the fix forward.
> >
> > -----Original Message-----
> > From: ctakes-user-return-29-Murphy.Sean=mayo.edu@incubator.apache.org
> > [mailto:ctakes-user-return-29-Murphy.Sean=mayo.edu@incubator.apache.org] On Behalf Of Chen, Pei
> > Sent: Thursday, November 15, 2012 3:15 PM
> > To: <ct...@incubator.apache.org>
> > Cc: ctakes-user@incubator.apache.org; ctakes-dev@incubator.apache.org
> > Subject: Re: Regarding Assertion Tagger
> >
> > Hi Sean,
> > What was the issue in 2.5?  Just want to make sure this is also fixed in the
> > upcoming 3.0 release coming out of ASF as well... Jira#?
> >
> >
> >
> > On Nov 15, 2012, at 7:59 PM, "Murphy, Sean P." <Mu...@mayo.edu> wrote:
> >
> >> I was able to see an issue with the 'PAD term spotter' which will most
> >> likely be related to the problem you're seeing with the smoking status as
> >> well.    The problem seems to have stemmed from the reorganization of the
> >> path structures with the latest cTAKES release.  Due to time and resource
> >> constraints we were not able to test each project independently.
> >>
> >> I will open a bug report against these problems and provide a fix as soon
> >> as possible.  I will keep you posted, but I hope to have this resolved in a few
> >> days.
> >>   Thanks,
> >>       ~Sean
> >>
> >> -----Original Message-----
> >> From: ctakes-user-return-26-Murphy.Sean=mayo.edu@incubator.apache.org
> >> [mailto:ctakes-user-return-26-Murphy.Sean=mayo.edu@incubator.apache.org] On Behalf Of Coarr, Matt
> >> Sent: Thursday, November 15, 2012 12:22 PM
> >> To: ctakes-dev@incubator.apache.org
> >> Cc: ctakes-user@incubator.apache.org
> >> Subject: Re: Regarding Assertion Tagger
> >>
> >> You were looking in the right place, Deepal! The "cTAKES 2.5 Component
> >> Use Guide" (the link at the bottom of your email) has a link to more
> >> information about the assertion module.
> >>
> >> Assertion module info:
> >>
> >> https://wiki.nci.nih.gov/display/VKC/cTAKES+2.5+-+Assertion
> >>
> >> I'm not familiar with the peripheral artery disease spotter or the simulated
> >> prod smoking TAE.  So I'll let someone else chime in there.
> >>
> >> Matt
> >>
> >>
> >> On 2012-11-15 13:11 , "Deepal Dhariwal" <de...@gmail.com> wrote:
> >>
> >>> Hello Matt,
> >>>
> >>> Thanks for your reply. I am using the cTAKES-2.5.0 binary version, which
> >>> I downloaded from
> >>> https://wiki.nci.nih.gov/display/VKC/cTAKES+2.5+User+Install+Instructions
> >>> .
> >>> I have gone through the cTAKES documentation; however, nowhere was it
> >>> mentioned that polarity / uncertainty properties are on Entity Mention.
> >>> In order to avoid sending repeated mails to the mailing list, could
> >>> you tell me if there is some other documentation as well? I am
> >>> trying to use the Peripheral Artery Disease Spotter; however, it returns
> >>> only a document annotation. Further, even the SimulatedProdSmokingTAE
> >>> annotator returns smoking status 'unknown' for every input. Is there
> >>> some order in which these annotators need to be executed? (Reference:
> >>> https://wiki.nci.nih.gov/display/VKC/cTAKES+2.5+Component+Use+Guide
> >>> )
> >>>
> >>> Thanks for clarifying the user list email id.
> >>>
> >>> Regards
> >>> Deepal Dhariwal
> >>>
> >>> On Thu, Nov 15, 2012 at 12:48 PM, Coarr, Matt <mc...@mitre.org> wrote:
> >>>
> >>>> FYI, the user list is ctakes-user (singular).  I've corrected the CC.
> >>>>
> >>>> The polarity/conditional/uncertainty properties are on
> >>>> EntityMention and EventMention.
> >>>>
> >>>> Are you using a current development copy of ctakes (from apache svn
> >>>> or from 3.0 RC2)?
> >>>>
> >>>> If not, what version of ctakes are you using?  Version number?
> >>>> Binary, source zip, or source from svn?
> >>>>
> >>>> Matt
> >>>>
> >>>>
> >>


RE: DrugNER & PAD Term Spotter (was: Regarding Assertion Tagger)

Posted by "Chen, Pei" <Pe...@childrens.harvard.edu>.
Thanks Sean.
That would be awesome!


From: Murphy, Sean P. [mailto:Murphy.Sean@mayo.edu]
Sent: Friday, November 30, 2012 3:11 PM
To: ctakes-user@incubator.apache.org; 'Deepal Dhariwal'; 'ctakes-dev@incubator.apache.org'
Subject: RE: DrugNER & PAD Term Spotter (was: Regarding Assertion Tagger)

Thanks Pei,
                Agreed.  I believe there was an Ant build script built by our team that provided an automated means to change the locations when creating the iCTAKES build from the separate cTAKES projects.   I wasn't sure of the nature of the changes in the Maven-driven builds, so that's where I misunderstood where and how the changes were to be made.
                If it's okay with everyone, I will search all the projects for the old import-by-location relative paths, update them, and test against 3.0.
                Thanks,
                                ~Sean
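
[Editor's sketch] The search Sean describes could be scripted roughly as follows. This is a minimal, JDK-only sketch under stated assumptions: the class name LocationImportFinder is hypothetical, the root directory is passed as an argument, and a simple regular expression stands in for real XML parsing. Some location-based imports may be intentional, so the output is a list to review, not a list of bugs.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: lists XML descriptors that still use <import location="..."/>.
class LocationImportFinder {

    private static final Pattern LOCATION_IMPORT = Pattern.compile("<import\\s+location\\s*=");

    static boolean usesLocationImport(String xml) {
        return LOCATION_IMPORT.matcher(xml).find();
    }

    static List<Path> findDescriptors(Path root) throws IOException {
        List<Path> xmlFiles;
        try (Stream<Path> paths = Files.walk(root)) {
            xmlFiles = paths.filter(Files::isRegularFile)
                            .filter(file -> file.toString().endsWith(".xml"))
                            .collect(Collectors.toList());
        }
        List<Path> hits = new ArrayList<>();
        for (Path xmlFile : xmlFiles) {
            String content = new String(Files.readAllBytes(xmlFile), StandardCharsets.UTF_8);
            if (usesLocationImport(content)) {
                hits.add(xmlFile);
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // Root directory to scan; defaults to the current directory.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path hit : findDescriptors(root)) {
            System.out.println(hit);
        }
    }
}
```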



RE: DrugNER & PAD Term Spotter (was: Regarding Assertion Tagger)

Posted by "Chen, Pei" <Pe...@childrens.harvard.edu>.
Thanks Sean.
That would be awesome!


From: Murphy, Sean P. [mailto:Murphy.Sean@mayo.edu]
Sent: Friday, November 30, 2012 3:11 PM
To: ctakes-user@incubator.apache.org; 'Deepal Dhariwal'; 'ctakes-dev@incubator.apache.org'
Subject: RE: DrugNER & PAD Term Spotter (was: Regarding Assertion Tagger)

Thanks Pei,
                Agreed.  I believe there was an ant build script built by our team that provided an automated means to change the locations when creating the iCTAKES build  from the separate cTAKES projects.   I wasn't sure the nature of the changes to the Maven driven builds, so that's where I misunderstood where and how the changes were to be made.
                If okay with everyone I will search all the projects using the old implementation of the import location relative paths, update and test against 3.0.
                Thanks,
                                ~Sean


From: ctakes-user-return-39-Murphy.Sean=mayo.edu@incubator.apache.org [mailto:ctakes-user-return-39-Murphy.Sean=mayo.edu@incubator.apache.org] On Behalf Of Chen, Pei
Sent: Friday, November 30, 2012 12:55 PM
To: ctakes-user@incubator.apache.org; 'Deepal Dhariwal'; 'ctakes-dev@incubator.apache.org'
Subject: RE: DrugNER & PAD Term Spotter (was: Regarding Assertion Tagger)

Hi Sean,
It should load the type system by name instead of path now.
The xml file should:
<import name="org.apache.ctakes.typesystem.types.TypeSystem"/>
<import name="org.apache.ctakes.drugner.types.TypeSystem"/>

Instead of:
<import location="../type_system/NERTypeSystem.xml"/>

While testing this, I also noticed that the chunker classname in DrugAggregatePlaintextProcessor.xml should be:
org.apache.ctakes.chunker.ae.PhraseTypeChunkCreator
instead of:
edu.mayo.bmi.uima.chunker.PhraseTypeChunkCreator

--Pei

From: Murphy, Sean P. [mailto:Murphy.Sean@mayo.edu]
Sent: Friday, November 30, 2012 1:13 PM
To: 'Deepal Dhariwal'; 'ctakes-dev@incubator.apache.org'
Cc: '<ct...@incubator.apache.org>>'
Subject: RE: Regarding Assertion Tagger

I'm able to reproduce the problem on 3.0 in either the prebuild zip bin or the Maven built environment.  The underlying issue appears to be related to the paths not being correctly handled.   I will investigate further.

Caused by: org.apache.uima.util.InvalidXMLException: Import failed.  Could not read from URL file:/C:/tools/cTAKESv3.0_bin/desc/ctakes-drug-ner/desc/type_system/NERTypeSystem.xml. (Descriptor: file:/C:/tools/cTAKESv3.0_bin/desc/ctakes-drug-ner/desc/analysis_engine/DrugCNP2LookupWindow.xml)
                at org.apache.uima.resource.metadata.impl.TypeSystemDescription_impl.resolveImports(TypeSystemDescription_impl.java:231)
                at org.apache.uima.resource.metadata.impl.TypeSystemDescription_impl.resolveImports(TypeSystemDescription_impl.java:207)
                at org.apache.uima.analysis_engine.metadata.impl.AnalysisEngineMetaData_impl.resolveImports(AnalysisEngineMetaData_impl.java:87)
                at org.apache.uima.analysis_engine.impl.AnalysisEngineDescription_impl.resolveImports(AnalysisEngineDescription_impl.java:741)
                at org.apache.uima.analysis_engine.impl.AnalysisEngineDescription_impl.resolveDelegateAnalysisEngineImports(AnalysisEngineDescription_impl.java:827)
                at org.apache.uima.analysis_engine.impl.AnalysisEngineDescription_impl.resolveImports(AnalysisEngineDescription_impl.java:733)
                at org.apache.uima.analysis_engine.impl.AnalysisEngineDescription_impl.resolveDelegateAnalysisEngineImports(AnalysisEngineDescription_impl.java:827)
                at org.apache.uima.analysis_engine.impl.AnalysisEngineDescription_impl.resolveDelegateAnalysisEngineImports(AnalysisEngineDescription_impl.java:765)
                at org.apache.uima.analysis_engine.impl.AnalysisEngineDescription_impl.getDelegateAnalysisEngineSpecifiers(AnalysisEngineDescription_impl.java:193)
                at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initialize(AggregateAnalysisEngine_impl.java:147)
                ... 45 more
Caused by: java.io.FileNotFoundException: C:\tools\cTAKESv3.0_bin\desc\ctakes-drug-ner\desc\type_system\NERTypeSystem.xml (The system cannot find the path specified)
                at java.io.FileInputStream.open(Native Method)
                at java.io.FileInputStream.<init>(Unknown Source)
                at java.io.FileInputStream.<init>(Unknown Source)
                at sun.net.www.protocol.file.FileURLConnection.connect(Unknown Source)
                at sun.net.www.protocol.file.FileURLConnection.getInputStream(Unknown Source)
                at org.apache.uima.util.XMLInputSource.<init>(XMLInputSource.java:120)
                at org.apache.uima.resource.metadata.impl.TypeSystemDescription_impl.resolveImport(TypeSystemDescription_impl.java:263)
                at org.apache.uima.resource.metadata.impl.TypeSystemDescription_impl.resolveImports(TypeSystemDescription_impl.java:229)
                ... 54 more
                Thanks,
                                ~Sean

From: Murphy, Sean P.
Sent: Friday, November 30, 2012 8:58 AM
To: 'Deepal Dhariwal'; ctakes-dev@incubator.apache.org<ma...@incubator.apache.org>
Cc: <ct...@incubator.apache.org>>
Subject: RE: Regarding Assertion Tagger

Deepal,
               Thanks for verifying #1.   It took a bit longer to setup my environment on a test machine to verify, but please bear with me while I run through a regime of tests regarding these pipelines.

#2. You may need to increase the pool size used by the VM arguments for your java environment.  I believe the default is " -Xms1024M -Xmx2048M".   If your system has the resources you may want to increase these by 1GM and retest.  I do not believe this will improve the time to run, however.    Perhaps someone else has some suggestions regarding this aspect(?).

#3.  I will have to defer this question to the rest of the team.


From: Deepal Dhariwal [mailto:deepaldhariwal@gmail.com]
Sent: Thursday, November 29, 2012 8:47 PM
To: ctakes-dev@incubator.apache.org<ma...@incubator.apache.org>
Cc: Murphy, Sean P.; <ct...@incubator.apache.org>>
Subject: Re: Regarding Assertion Tagger

Hello all,

1. I downloaded cTAKES 3.0 and was trying the PAD Term Spotter and the Drug NER lookup annotator, but I am getting a ResourceInitializationException. I have a valid UMLS license and I have added the username and password in the UMLS Lookup Annotator. I have been following the thread on the PAD Term Spotter bug in cTAKES 2.5 and I wanted to know whether it has been resolved in cTAKES 3.0.
2. Further, my input data set is 4 MB. When I run the Collection Processing Engine on the data set I get a java.lang.OutOfMemoryError (Java heap space). Is there any way I could resolve this error and also reduce the time taken to execute on such a large data set?
3. Lastly, as part of my thesis I am working on extracting cardiovascular terms from medical text using cTAKES and the UMLS vocabulary. I want to map these terms to existing medical OWL ontologies, for example the UMLS Semantic Network. I wanted to know whether the cTAKES community is considering including a medical ontology feature in cTAKES.

Thanks
Deepal Dhariwal


On Mon, Nov 26, 2012 at 12:25 PM, Chen, Pei <Pe...@childrens.harvard.edu> wrote:
Thanks Sean,
If the issue was just a descriptor path issue, then it was probably already fixed in 3.0 as part of the ASF move.
We can just verify and test it?

--Pei

> -----Original Message-----
> From: Murphy, Sean P. [mailto:Murphy.Sean@mayo.edu]
> Sent: Monday, November 26, 2012 12:21 PM
> To: Chen, Pei
> Cc: ctakes-dev@incubator.apache.org; <ct...@incubator.apache.org>
> Subject: RE: Regarding Assertion Tagger
>
> Hello Pei,
>       I have created a bug for the 3.0 branch as well.   However, since the
> problem is related to the relative path structures being incorrectly migrated
> to the updated format,  I am not sure if the fix should be made to the
> sourceforge 2.5 version only or all releases.    The maven driven build changes
> appear to be consolidating some of these issues, but won't be in place until
> the 3.0 build has finalized.    If so, and please correct me if I'm wrong, then:
>       1) There is no need to fix at 2.6, and
>       2) The fix checked in at 3.0 would be based on the old directory
> structure.
>       Thanks,
>               ~Sean
>
> -----Original Message-----
> From: ctakes-user-return-31-Murphy.Sean=mayo.edu@incubator.apache.org [mailto:ctakes-user-return-31-Murphy.Sean=mayo.edu@incubator.apache.org] On Behalf Of Chen, Pei
> Sent: Thursday, November 15, 2012 3:45 PM
> To: <ct...@incubator.apache.org>
> Cc: <ct...@incubator.apache.org>; ctakes-dev@incubator.apache.org
> Subject: Re: Regarding Assertion Tagger
>
> There's a 3.0.0 branch.  The release will be made from there.  So we should
> make the fixes in both trunk and 3.0.0.
>
>
> On Nov 15, 2012, at 10:32 PM, "Murphy, Sean P." <Mu...@mayo.edu> wrote:
>
> > Hello Pei,
> >    The issue is at 2.5.   When is the 3.0 release freeze?   I will try to propagate the fix forward.
> >
> > -----Original Message-----
> > From: ctakes-user-return-29-Murphy.Sean=mayo.edu@incubator.apache.org [mailto:ctakes-user-return-29-Murphy.Sean=mayo.edu@incubator.apache.org] On Behalf Of Chen, Pei
> > Sent: Thursday, November 15, 2012 3:15 PM
> > To: <ct...@incubator.apache.org>
> > Cc: ctakes-user@incubator.apache.org; ctakes-dev@incubator.apache.org
> > Subject: Re: Regarding Assertion Tagger
> >
> > Hi Sean,
> > What was the issue in 2.5?  Just want to make sure this is also fixed in the upcoming 3.0 release coming out of ASF as well... Jira#?
> >
> >
> >
> > On Nov 15, 2012, at 7:59 PM, "Murphy, Sean P." <Mu...@mayo.edu> wrote:
> >
> >> I was able to see an issue with the 'PAD term spotter' which will most likely be related to the problem you're seeing with the smoking status as well.  The problem seems to have stemmed from the reorganization of the path structures with the latest cTAKES release.  Due to time and resource constraints we were not able to test each project independently.
> >>
> >> I will open a bug report against these problems and provide a fix as soon as possible.  I will keep you posted, but I hope to have this resolved in a few days.
> >>   Thanks,
> >>       ~Sean
> >>
> >> -----Original Message-----
> >> From: ctakes-user-return-26-Murphy.Sean=mayo.edu@incubator.apache.org [mailto:ctakes-user-return-26-Murphy.Sean=mayo.edu@incubator.apache.org] On Behalf Of Coarr, Matt
> >> Sent: Thursday, November 15, 2012 12:22 PM
> >> To: ctakes-dev@incubator.apache.org
> >> Cc: ctakes-user@incubator.apache.org
> >> Subject: Re: Regarding Assertion Tagger
> >>
> >> You were looking in the right place, Deepal! The "cTAKES 2.5 Component Use Guide" (the link at the bottom of your email) has a link to more information about the assertion module.
> >>
> >> Assertion module info:
> >>
> >> https://wiki.nci.nih.gov/display/VKC/cTAKES+2.5+-+Assertion
> >>
> >> I'm not familiar with the peripheral artery disease spotter or the simulated prod smoking TAE, so I'll let someone else chime in there.
> >>
> >> Matt
> >>
> >>
> >> On 2012-11-15 13:11, "Deepal Dhariwal" <de...@gmail.com> wrote:
> >>
> >>> Hello Matt,
> >>>
> >>> Thanks for your reply. I am using the cTAKES-2.5.0 binary version, which
> >>> I downloaded from
> >>> https://wiki.nci.nih.gov/display/VKC/cTAKES+2.5+User+Install+Instructions .
> >>> I have gone through the cTAKES documentation; however, nowhere was it
> >>> mentioned that the polarity / uncertainty properties are on Entity Mention.
> >>> In order to avoid sending repeated mails to the mailing list, could
> >>> you tell me if there is some other documentation as well? I am
> >>> trying to use the Peripheral Artery Disease Spotter; however, it returns
> >>> only a document annotation. Further, even the SimulatedProdSmokingTAE
> >>> annotator returns smoking status 'unknown' for every input. Is there
> >>> some order in which these annotators need to be executed? (Reference:
> >>> https://wiki.nci.nih.gov/display/VKC/cTAKES+2.5+Component+Use+Guide )
> >>>
> >>> Thanks for clarifying the user list email id.
> >>>
> >>> Regards
> >>> Deepal Dhariwal
> >>>
> >>> On Thu, Nov 15, 2012 at 12:48 PM, Coarr, Matt <mc...@mitre.org> wrote:
> >>>
> >>>> FYI, the user list is ctakes-user (singular).  I've corrected the CC.
> >>>>
> >>>> The polarity/conditional/uncertainty properties are on
> >>>> EntityMention and EventMention.
> >>>>
> >>>> Are you using a current development copy of ctakes (from apache svn
> >>>> or from 3.0 RC2)?
> >>>>
> >>>> If not, what version of ctakes are you using?  Version number?
> >>>> Binary, source zip, or source from svn?
> >>>>
> >>>> Matt
> >>>>
> >>>>
> >>


RE: DrugNER & PAD Term Spotter (was: Regarding Assertion Tagger)

Posted by "Murphy, Sean P." <Mu...@mayo.edu>.
Thanks Pei,
                Agreed.  I believe there was an Ant build script built by our team that provided an automated means to change the locations when creating the iCTAKES build from the separate cTAKES projects.  I wasn't sure of the nature of the changes to the Maven-driven builds, so I misunderstood where and how the changes were to be made.
                If okay with everyone, I will search all the projects for the old implementation of the import location relative paths, update them, and test against 3.0.
                Thanks,
                                ~Sean
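
A search like the one described could be sketched roughly as follows. This is an illustration only, not the script that was actually used. The class and method names are hypothetical, and the pattern is a plain text match for an import element that uses a location attribute. Matches are candidates for review rather than confirmed bugs, since aggregate descriptors may also import their delegates by location.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class LocationImportFinder {
    // Matches an <import ... location=...> element; purely textual, no XML parsing.
    private static final Pattern LOCATION_IMPORT = Pattern.compile("<import\\s+location\\s*=");

    static boolean usesLocationImport(String xml) {
        return LOCATION_IMPORT.matcher(xml).find();
    }

    // Walks the tree under root and returns the .xml files that contain a location import.
    static List<Path> findDescriptors(Path root) {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile)
                        .filter(p -> p.getFileName().toString().endsWith(".xml"))
                        .filter(p -> usesLocationImport(read(p)))
                        .sorted()
                        .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static String read(Path p) {
        try {
            return new String(Files.readAllBytes(p), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The directory to scan is taken from the first argument; defaults to the current directory.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        findDescriptors(root).forEach(System.out::println);
    }
}
```

Each file it lists would then be checked by hand and, where appropriate, switched to the import-by-name form Pei describes.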



