Posted to notifications@ctakes.apache.org by "James Joseph Masanz (JIRA)" <ji...@apache.org> on 2017/04/03 20:23:41 UTC

[jira] [Closed] (CTAKES-96) Update Dependency Parser and Semantic Role Labeler - Thanks Jinho Choi and Lee Beecker

     [ https://issues.apache.org/jira/browse/CTAKES-96?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Joseph Masanz closed CTAKES-96.
-------------------------------------

Closing this issue that was resolved a while ago.

> Update Dependency Parser and Semantic Role Labeler - Thanks Jinho Choi and Lee Beecker
> --------------------------------------------------------------------------------------
>
>                 Key: CTAKES-96
>                 URL: https://issues.apache.org/jira/browse/CTAKES-96
>             Project: cTAKES
>          Issue Type: New Feature
>    Affects Versions: 3.0-incubating
>            Reporter: Pei Chen
>            Assignee: Pei Chen
>             Fix For: 3.1.0
>
>
> Update/create new wrappers for ClearNLP that have been trained on clinical notes (SHARP/MiPACQ).
> Some notes:
> The integration will mostly consist of switching to cTAKES types.
> Here are a few critical spots:
> In the tokenizer (https://code.google.com/p/cleartk/source/browse/cleartk-clearnlp/src/main/java/org/cleartk/clearnlp/Tokenizer.java), lines 96 and 106 are all that should need changing to switch to cTAKES Sentence and Token types.
> In the POS tagger (https://code.google.com/p/cleartk/source/browse/cleartk-clearnlp/src/main/java/org/cleartk/clearnlp/PosTagger.java), most of the changes should be on lines 109 and 116-118.
> In the MP Analyzer (https://code.google.com/p/cleartk/source/browse/cleartk-clearnlp/src/main/java/org/cleartk/clearnlp/MPAnalyzer.java), the changes would be on lines 122-124, again to use the cTAKES token types.
> The Dependency Parser (https://code.google.com/p/cleartk/source/browse/cleartk-clearnlp/src/main/java/org/cleartk/clearnlp/DependencyParser.java) is a bit harder, but similar. I think you can step through, find the instances of ClearTK types, and swap them for the Dependency Relation types in cTAKES. Basically, the code grabs the token, POS, and lemma data from the CAS and passes it on to Jinho's SRL. The remaining work is mapping that output back into the appropriate CAS types.
> The Semantic Role Labeler (https://code.google.com/p/cleartk/source/browse/cleartk-clearnlp/src/main/java/org/cleartk/clearnlp/SemanticRoleLabeler.java) follows a similar flow, but also pulls Dependency Parse information out of the CAS. The remaining work is extracting the SRL arguments and predicates and putting them back into ClearTK CAS types.
> Lastly, to get an idea of how these components are called in a UIMA pipeline, I would refer to the test cases, especially the ClearNLP test case (https://code.google.com/p/cleartk/source/browse/cleartk-clearnlp/src/test/java/org/cleartk/clearnlp/ClearNLPTest.java).
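
The flow described for the Dependency Parser wrapper (read token, POS, and lemma data, run the parser, then map its output back into annotation types) can be sketched in plain Java. This is only an illustration under stated assumptions: Token, Arc, DepNode, Parser, and ToyParser are hypothetical stand-ins invented for this sketch, not classes from cTAKES, ClearTK, ClearNLP, or UIMA, and the toy parser simply attaches every token to the first one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: every type here is a hypothetical stand-in, not a cTAKES,
// ClearTK, ClearNLP, or UIMA class.
class WrapperFlowSketch {

    /** Stand-in for a token annotation already present in the CAS. */
    static final class Token {
        final String form;
        final String pos;
        final String lemma;

        Token(String form, String pos, String lemma) {
            this.form = form;
            this.pos = pos;
            this.lemma = lemma;
        }
    }

    /** Stand-in for one arc of parser output: 1-based head index (0 = root) plus label. */
    static final class Arc {
        final int head;
        final String label;

        Arc(int head, String label) {
            this.head = head;
            this.label = label;
        }
    }

    /** Stand-in for the dependency annotation that would be written back to the CAS. */
    static final class DepNode {
        final int id;
        final String form;
        final int headId;
        final String relation;

        DepNode(int id, String form, int headId, String relation) {
            this.id = id;
            this.form = form;
            this.headId = headId;
            this.relation = relation;
        }
    }

    /** Hypothetical parser interface: returns exactly one arc per input token. */
    interface Parser {
        List<Arc> parse(List<Token> sentence);
    }

    /** Toy parser: the first token is the root and every other token attaches to it. */
    static final class ToyParser implements Parser {
        @Override
        public List<Arc> parse(List<Token> sentence) {
            List<Arc> arcs = new ArrayList<>();
            for (int i = 0; i < sentence.size(); i++) {
                arcs.add(i == 0 ? new Arc(0, "root") : new Arc(1, "dep"));
            }
            return arcs;
        }
    }

    /** Read tokens, run the parser, then map its output back to annotation-like nodes. */
    static List<DepNode> annotate(List<Token> sentence, Parser parser) {
        List<Arc> arcs = parser.parse(sentence);
        if (arcs.size() != sentence.size()) {
            throw new IllegalStateException("parser must return one arc per token");
        }
        List<DepNode> nodes = new ArrayList<>();
        for (int i = 0; i < sentence.size(); i++) {
            Arc arc = arcs.get(i);
            // Node ids are 1-based so that head index 0 can mean "root".
            nodes.add(new DepNode(i + 1, sentence.get(i).form, arc.head, arc.label));
        }
        return nodes;
    }

    public static void main(String[] args) {
        List<Token> sentence = Arrays.asList(
                new Token("Patient", "NN", "patient"),
                new Token("denies", "VBZ", "deny"),
                new Token("pain", "NN", "pain"));
        for (DepNode node : annotate(sentence, new ToyParser())) {
            System.out.println(node.id + "\t" + node.form + "\t" + node.headId + "\t" + node.relation);
        }
    }
}
```

In a real wrapper, the Token and DepNode stand-ins would presumably be replaced by the corresponding cTAKES annotation types, and ToyParser by the ClearNLP component. The Semantic Role Labeler wrapper would follow the same read, run, and map-back shape, with the dependency information as additional input.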



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)