Posted to issues@opennlp.apache.org by "Rodrigo Agerri (JIRA)" <ji...@apache.org> on 2014/05/10 23:55:24 UTC
[jira] [Commented] (OPENNLP-31) Add evaluation support to the parser component

    [ https://issues.apache.org/jira/browse/OPENNLP-31?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13993558#comment-13993558 ] 

Rodrigo Agerri commented on OPENNLP-31:
---------------------------------------

The evaluator now works correctly, or at least it produces the same results as EVALB. I tested it on section 23 of the Penn Treebank (2416 parses) and on 3LB-CAST (3K parses) of Ancora.

The evaluator now adds every constituent to the initial stack for evaluation. Otherwise incomplete parses are not considered: no Span is added, because the traversal stops at the first preterminal. To calculate the F-measure, I remove a predicted Span once it has been matched with a reference Span. Otherwise recall is inflated, because a single prediction can be matched more than once.

Instead of changing the FMeasure class directly, I have added a new provisional ParseEval class. It differs from FMeasure only in that it removes a predicted Span once it has been matched to a reference Span. I think a separate issue should be opened for making this change in FMeasure itself. Incidentally, I have compiled with these changes applied to FMeasure and all tests pass, so this should be trivial.
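To illustrate the match-once counting described above, here is a minimal, self-contained sketch. It is not the actual ParseEval or FMeasure code: the class name MatchOnceFMeasure, its nested Span stand-in, and the method names are all hypothetical and chosen only for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** Hypothetical sketch of match-once scoring; not the OpenNLP implementation. */
final class MatchOnceFMeasure {

    /** Minimal stand-in for a labelled constituent span (assumption, not opennlp.tools.util.Span). */
    static final class Span {
        final int start;
        final int end;
        final String type;

        Span(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Span)) {
                return false;
            }
            Span s = (Span) o;
            return start == s.start && end == s.end && type.equals(s.type);
        }

        @Override
        public int hashCode() {
            return Objects.hash(start, end, type);
        }
    }

    private long truePositives; // reference spans that found a prediction
    private long selected;      // total predicted spans
    private long target;        // total reference spans

    /** Adds the counts for one sentence to the running totals. */
    void update(List<Span> references, List<Span> predictions) {
        // Work on a copy so that each prediction can be consumed at most once.
        List<Span> remaining = new ArrayList<>(predictions);
        for (Span ref : references) {
            // remove(Object) deletes the first equal element and reports success.
            if (remaining.remove(ref)) {
                truePositives++;
            }
        }
        selected += predictions.size();
        target += references.size();
    }

    double precision() {
        return selected > 0 ? (double) truePositives / selected : 0.0;
    }

    double recall() {
        return target > 0 ? (double) truePositives / target : 0.0;
    }

    double fMeasure() {
        double p = precision();
        double r = recall();
        return p + r > 0 ? 2 * p * r / (p + r) : 0.0;
    }
}
```

With a reference that contains the same labelled span twice (for example a unary NP over NP) and a prediction that contains it once, this counts one match instead of two, so recall is not overstated.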

Note that ParserEvaluator now behaves like EVALB with no exceptions, e.g., it does not remove punctuation tags or treat ADVP and PRT as equal (as the COLLINS param file does). To follow the COLLINS convention for EVALB, we could add options to ParserEvaluator. That way, more than one convention (e.g., Bikel, Charniak) could be supported.
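As a rough idea of what such options might look like, here is a small sketch of a pluggable convention that drops some labels and maps equivalent ones to a common name. Everything in it is an assumption made for illustration: the EvalConvention class, its factory methods, and the particular punctuation tags listed are hypothetical and are not taken from ParserEvaluator or copied from the COLLINS param file. Only the ADVP/PRT equivalence comes from the discussion above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical evaluation convention; not part of OpenNLP. */
final class EvalConvention {

    private final Set<String> deletedLabels;
    private final Map<String, String> equivalentLabels;

    EvalConvention(Set<String> deletedLabels, Map<String, String> equivalentLabels) {
        this.deletedLabels = deletedLabels;
        this.equivalentLabels = equivalentLabels;
    }

    /** No exceptions: every label is kept and compared literally. */
    static EvalConvention strict() {
        return new EvalConvention(
            Collections.<String>emptySet(),
            Collections.<String, String>emptyMap());
    }

    /** COLLINS-like behaviour; the punctuation tag list is an illustrative assumption. */
    static EvalConvention collinsLike() {
        Set<String> deleted = new HashSet<>(Arrays.asList(",", ":", "``", "''", "."));
        Map<String, String> equivalent = new HashMap<>();
        equivalent.put("PRT", "ADVP"); // treat PRT as equal to ADVP
        return new EvalConvention(deleted, equivalent);
    }

    boolean isDeleted(String label) {
        return deletedLabels.contains(label);
    }

    /** Maps a label to its canonical form, or returns it unchanged. */
    String canonical(String label) {
        return equivalentLabels.getOrDefault(label, label);
    }

    /** Drops deleted labels and canonicalises the rest, preserving order. */
    List<String> normalize(List<String> labels) {
        List<String> result = new ArrayList<>();
        for (String label : labels) {
            if (!isDeleted(label)) {
                result.add(canonical(label));
            }
        }
        return result;
    }
}
```

This only normalises labels. A real implementation would also have to adjust span boundaries after removing punctuation tokens, which the sketch leaves out.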

What do you think about: 

- Open an issue to change FMeasure so that a predicted Span is removed once it has been matched, which avoids matching the same Span more than once, and remove the provisional ParseEval class accordingly.
- Open an issue to add COLLINS parameters for evaluation.
- Open an issue for a ParserEvaluator test case.
- Open an issue for ParserEvaluator documentation.

Rodrigo

> Add evaluation support to the parser component
> ----------------------------------------------
>
>                 Key: OPENNLP-31
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-31
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: Parser
>            Reporter: Joern Kottmann
>            Assignee: Joern Kottmann
>             Fix For: 1.6.0
>
>         Attachments: tree-1.gold, tree-1.log, tree-1.test, tree-2.gold, tree-2.log, tree-2.test
>
>
> Add support for evaluating the tagging performance of the parser.
> Issue was migrated from SourceForge:
> https://sourceforge.net/tracker/?func=detail&aid=2557289&group_id=3368&atid=353368



--
This message was sent by Atlassian JIRA
(v6.2#6252)