Posted to dev@ctakes.apache.org by Devikiran Ramadas <de...@gmail.com> on 2017/04/05 09:38:09 UTC

Regarding Negation, Uncertainty Pipes

Hi,

I have been looking into the cTAKES code base for some time now and am
confused about how to identify negation, uncertainty and other
properties of an identified named entity.

It can be done by:
1) Assertion module
      - ClearTK-based (negation, uncertainty, subject, history, generic, etc.)
      - Non-ClearTK (generic, subject)
2) Context Annotator (negation, uncertainty and history)

I have been using the Context Annotator, but I see that the ClearTK-based
pipes are used even in the svn trunk version of the ClinicalPipelineFactory
class.

Could someone please clarify which pipe is recommended for each of
the properties of an Identified Annotation?

Regards,
Devi

Re: Regarding Negation, Uncertainty Pipes

Posted by "Miller, Timothy" <Ti...@childrens.harvard.edu>.
Thanks for that perspective, Yiming.

I contributed to the ClearTK version of the system. At that time we
evaluated it for negation [1] and found that it generalized better
than rule-based negation detectors like NegEx. Since then, we've
found on some projects that NegEx is easier to adapt to new data
sets, simply by tuning a parameter for how wide a window to look at.
It is also easier to explain to physicians who may not be NLP-savvy,
since its rules are simple and its mistakes are easier to account for.
It also seems that the ClearTK system was biased towards precision and
NegEx towards recall, and our collaborators seemed more comfortable
with the recall bias. All that said, we don't have exhaustive
comparisons across the other attributes (uncertainty, generic,
hypothetical, subject, conditional), because performance on those
attributes with any system is quite often below what would be
considered acceptable for presenting to users, so any use of them is
almost beta testing.
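
To make the window parameter concrete, here is a minimal Python sketch
of a NegEx-style rule. It is not the cTAKES or NegEx implementation:
the trigger list, the function names and the default window size are
all assumptions made for illustration, and real NegEx also uses
termination terms and post-entity triggers that are left out here.

```python
import re

# Hypothetical trigger list for illustration only; the real NegEx
# lexicon is much larger and distinguishes several trigger types.
NEGATION_TRIGGERS = {"no", "not", "denies", "without"}


def tokenize(text):
    """Split text into word tokens, dropping punctuation."""
    return re.findall(r"\w+", text)


def is_negated(tokens, entity_start, window=5):
    """Return True if a negation trigger occurs within `window`
    tokens to the left of the entity starting at `entity_start`."""
    left = max(0, entity_start - window)
    return any(tok.lower() in NEGATION_TRIGGERS
               for tok in tokens[left:entity_start])


# A trigger that sits far from the entity is only found with a wide window.
far = tokenize("Patient denies any recent episodes of severe chest pain")
far_entity = far.index("chest")
narrow_hit = is_negated(far, far_entity, window=3)
wide_hit = is_negated(far, far_entity, window=6)

# The same wide window also reaches back into an unrelated sentence.
spill = tokenize("No fever. Patient reports chest pain")
spill_entity = spill.index("chest")
spill_narrow = is_negated(spill, spill_entity, window=2)
spill_wide = is_negated(spill, spill_entity, window=6)
```

In this sketch, widening the window catches the distant "denies" in the
first sentence but also marks "chest pain" as negated in the second one
because of the unrelated "No fever", which is the recall versus
precision trade-off described above.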

I believe the default system is not changing in the upcoming release,
but I think the rule-based system will stick around and so you can
switch to it if the default does not work on your data.

Hope that is helpful.

Tim

[1] http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0112774

On Wed, 2017-04-05 at 13:30 -0400, Zuo Yiming wrote:
> Hi Devi,
> 
> My feeling is that the ClearTK-based pipelines are the currently
> recommended ones. They have a specific pipeline for each property
> you may want to detect (negation, uncertainty, subject, etc.), which
> makes for more concise and understandable code.
> 
> From my personal experience, the non-ClearTK Assertion pipelines
> work well for detecting the generic and subject properties, while I
> have had difficulty getting the ClearTK-based pipelines to work for
> these two properties in user mode. In developer mode, I was not able
> to apply the non-ClearTK Assertion pipelines due to some errors, so
> it looks like the ClearTK-based pipelines are the only choice you
> have there. I don’t think the Context Annotator is still recommended.
> 
> Again, that’s only my feedback; more discussion is welcome.
> 
> Best,
> Yiming   

Re: Regarding Negation, Uncertainty Pipes

Posted by Zuo Yiming <yi...@gmail.com>.
Hi Devi,

My feeling is that the ClearTK-based pipelines are the currently recommended ones. They have a specific pipeline for each property you may want to detect (negation, uncertainty, subject, etc.), which makes for more concise and understandable code.

From my personal experience, the non-ClearTK Assertion pipelines work well for detecting the generic and subject properties, while I have had difficulty getting the ClearTK-based pipelines to work for these two properties in user mode. In developer mode, I was not able to apply the non-ClearTK Assertion pipelines due to some errors, so it looks like the ClearTK-based pipelines are the only choice you have there. I don’t think the Context Annotator is still recommended.

Again, that’s only my feedback; more discussion is welcome.

Best,
Yiming   
