Posted to dev@ctakes.apache.org by "Miller, Timothy" <Ti...@childrens.harvard.edu> on 2021/01/24 17:34:02 UTC

Re: neural negation model in ctakes [EXTERNAL]

Peter, I'd be happy to try it, especially if it's made easy with a ctakes module! At the very least that sounds like it would be a good baseline comparison to use if we are benchmarking new ML methods. We have several datasets available internally that are not widely available in the research community.
Tim

________________________________________
From: Peter Abramowitsch <pa...@gmail.com>
Sent: Sunday, January 24, 2021 12:05 PM
To: dev@ctakes.apache.org
Subject: Re: neural negation model in ctakes [EXTERNAL]

* External Email - Caution *


That's great, Tim - it sounds very sophisticated!

In fact I had made some changes to the Negex Annotator last fall which I
hadn't checked in but was waiting for Sean to test.  In a great deal of my
own testing I discovered that Negex, which is easily expandable to
accommodate new constructions, had only a couple of serious flaws, and I
believe I have fixed these, as well as a performance issue it had.  If
you're interested in testing it against yours, that would be great.
Reading your description above, I wondered how it would do on strings of
entities negated by a single trigger phrase either ahead of or behind the
series, or on a series of entities that begins as all negated but has one
expressed in a way that stops the negation pattern.  These are the
weaknesses I addressed in my changes.
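
The scope behavior described above can be illustrated with a toy forward
scan. This is purely illustrative and not the actual Negex Annotator
changes; the trigger and terminator word lists are invented for the
example:

```python
# Toy sketch of trigger-phrase scope over a series of entities: a trigger
# like "no" negates the entities that follow it until a scope terminator
# such as "but" stops the pattern. Word lists here are illustrative only.
TRIGGERS = {"no", "denies", "without"}
TERMINATORS = {"but", "however", "although"}

def negated_entities(tokens, entity_indices):
    """Return the indices of entities falling inside a negation scope."""
    negated, in_scope = set(), False
    for i, tok in enumerate(tokens):
        word = tok.lower().strip(",.")
        if word in TRIGGERS:
            in_scope = True          # open a negation scope
        elif word in TERMINATORS:
            in_scope = False         # a terminator stops the pattern
        if in_scope and i in entity_indices:
            negated.add(i)
    return negated
```

On "no fever, cough, but positive rales", a scan like this negates "fever"
and "cough" but leaves "rales" affirmed, which is the kind of series-breaking
case discussed above.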

Regards
Peter

On Sun, Jan 24, 2021 at 5:08 PM Miller, Timothy <
Timothy.Miller@childrens.harvard.edu> wrote:

> Hi all,
> I just checked in a usable proof-of-concept for a neural (RoBERTa-based,
> to be specific) negation classifier. The way it works is that a tiny bit
> of Python code (using FastAPI) sets up a REST interface that runs the
> classifier:
> ctakes-assertion/src/main/python/negation_rest.py
>
> It runs a default model that I trained and uploaded to the Huggingface
> model hub; it will be downloaded automatically the first time the server
> is run.
>
> There is a startup script there too:
> ctakes-assertion/src/main/python/start_negation_rest.sh
>
> The idea would be to run this on whatever machine you have with the
> appropriate GPU resources and it creates 3 REST endpoints:
> /negation/initialize  -- to load the model (takes longer the first time as
> it will download)
> /negation/process -- to classify the data and return negation values
> /negation/collection_process_complete -- to unload the model
>
> to mirror UIMA workflows. Then, the UIMA analysis engine sits in:
>
> ctakes-assertion/src/main/java/org/apache/ctakes/assertion/ae/PolarityBertRestAnnotator.java
>
> The main work here is converting the cTAKES entities/events into a simpler
> data structure that gets sent to the python REST server, making the REST
> call, and then converting the classifier output into the polarity property.
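
That conversion step could be sketched roughly as follows. The real
annotator (PolarityBertRestAnnotator) is Java/UIMA; this is a Python
sketch for brevity, and the function names and wire format are assumptions,
not the actual implementation:

```python
# Hypothetical sketch of the annotator's job: mark each entity span inline
# so the classifier sees it in sentence context, send the simple structure
# to the REST server, then map the returned labels back onto the entities
# as a polarity property.
import json

def to_rest_payload(sentence, entity_spans):
    """Build one marked-up instance per (begin, end) entity span."""
    instances = []
    for begin, end in entity_spans:
        marked = (sentence[:begin] + "<e> " + sentence[begin:end]
                  + " </e>" + sentence[end:])
        instances.append(marked)
    return json.dumps({"instances": instances})

def apply_polarities(entity_spans, statuses):
    """Attach each returned label to its entity as a polarity value."""
    return [{"span": span, "polarity": status}
            for span, status in zip(entity_spans, statuses)]
```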
>
> Performance:
> The accuracy of this classifier is much better in my testing. It should
> also make the path to improving performance easier: upgrading can
> potentially be just a change to the model string so that it grabs a new
> model from the model hub.
>
> The speed is marginally slower if we do a 1-for-1 swap, but that's a
> little misleading, because we currently run 2 parsers to generate
> features for the default ML negation module. If we don't need those
> parsers, we can dramatically cut processing time even with the neural
> negation module. I tested this with the python code running on a machine
> with a 1070ti. If we want to scale, the goal for these methods going
> forward should be to have the neural call do several things in a single
> pass, especially if we are using large transformer models. But this
> proof of concept of a single task will hopefully make it easier for
> other folks to do that if they wish.
>
> FYI, another way of doing this is by using python libraries like cassis
> and actually having python functions be essentially UIMA AEs -- I think
> there will be a place for both approaches and I'm not trying to wall off
> work in that direction.
>
> Tim
>
>