Posted to issues@opennlp.apache.org by "Jörn Kottmann (JIRA)" <ji...@apache.org> on 2011/06/24 13:30:47 UTC

[jira] [Commented] (OPENNLP-204) UIMA POSTaggerTrainer wrongly parses token annotations

    [ https://issues.apache.org/jira/browse/OPENNLP-204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054376#comment-13054376 ] 

Jörn Kottmann commented on OPENNLP-204:
---------------------------------------

Thanks for pointing this out. Would you mind attaching a patch file to this issue?

> UIMA POSTaggerTrainer wrongly parses token annotations
> ------------------------------------------------------
>
>                 Key: OPENNLP-204
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-204
>             Project: OpenNLP
>          Issue Type: Bug
>          Components: POS Tagger, UIMA Integration
>    Affects Versions: tools-1.5.1-incubating
>            Reporter: Nicolas Hernandez
>             Fix For: tools-1.5.2-incubating
>
>
> Affects the opennlp-uima package, in particular the opennlp/uima/postag/POSTaggerTrainer.java class.
> This AE is expected to parse token annotations and to build two data structures: an array of the tokens' covered texts and an array of the associated tags (the tags are read from a feature path set as a parameter).
> In practice, the tag value of the current token is wrongly added to the token array. 
> This can be fixed easily by changing the name of the data structure the tag is added to, from `tokens` to `tags`, at line 200.
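
For readers without the source at hand, the defect described above can be sketched roughly as follows. This is not the actual POSTaggerTrainer code: the class name TagCollectionSketch, the Token stand-in for a UIMA token annotation, and both method names are hypothetical and only illustrate, under those assumptions, how adding the tag to the wrong list corrupts the token array and leaves the tag array empty.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, self-contained illustration; not taken from opennlp-uima.
class TagCollectionSketch {

    // Stand-in for a UIMA token annotation: its covered text plus the
    // value of the tag feature (both names are assumptions).
    static class Token {
        final String coveredText;
        final String tag;

        Token(String coveredText, String tag) {
            this.coveredText = coveredText;
            this.tag = tag;
        }
    }

    // Buggy variant: the tag value is appended to the tokens list.
    // Returns [tokens, tags].
    static List<List<String>> collectBuggy(List<Token> annotations) {
        List<String> tokens = new ArrayList<>();
        List<String> tags = new ArrayList<>();
        for (Token t : annotations) {
            tokens.add(t.coveredText);
            tokens.add(t.tag); // wrong list: should be tags
        }
        return List.of(tokens, tags);
    }

    // Fixed variant: the only change is the target list of the tag value.
    // Returns [tokens, tags].
    static List<List<String>> collectFixed(List<Token> annotations) {
        List<String> tokens = new ArrayList<>();
        List<String> tags = new ArrayList<>();
        for (Token t : annotations) {
            tokens.add(t.coveredText);
            tags.add(t.tag);
        }
        return List.of(tokens, tags);
    }

    public static void main(String[] args) {
        List<Token> sentence = List.of(new Token("The", "DT"), new Token("dog", "NN"));
        System.out.println(collectBuggy(sentence)); // [[The, DT, dog, NN], []]
        System.out.println(collectFixed(sentence)); // [[The, dog], [DT, NN]]
    }
}
```

With the buggy variant the two lists no longer have the same length, so they cannot be paired up as one training sample; the one-word change restores one tag per token.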

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira