
Posted to issues@opennlp.apache.org by "Nicolas Hernandez (JIRA)" <ji...@apache.org> on 2011/06/24 13:28:47 UTC

[jira] [Created] (OPENNLP-204) UIMA POSTaggerTrainer wrongly parses token annotations

UIMA POSTaggerTrainer wrongly parses token annotations
------------------------------------------------------

                 Key: OPENNLP-204
                 URL: https://issues.apache.org/jira/browse/OPENNLP-204
             Project: OpenNLP
          Issue Type: Bug
          Components: POS Tagger, UIMA Integration
    Affects Versions: tools-1.5.1-incubating
            Reporter: Nicolas Hernandez
             Fix For: tools-1.5.2-incubating


Affects the opennlp-uima package, in particular the opennlp/uima/postag/POSTaggerTrainer.java class.

This AE is expected to parse token annotations and to build two data structures. The first is an array of the tokens' covered texts, and the second is an array of the associated tags (the tags are read via a feature structure path set by a parameter).

In practice, the tag value of the current token is wrongly added to the token array. 

This is easily fixed by changing the data structure the tag is added to, from `tokens` to `tags`, at line 200.
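
A minimal sketch of the pattern described above, not the actual POSTaggerTrainer code: the class `TagCollectionSketch`, the `Token` stand-in for a UIMA token annotation, and both method names are hypothetical and exist only for illustration. It shows how adding the tag to `tokens` instead of `tags` interleaves tags with the covered texts and leaves the tag list empty.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TagCollectionSketch {

    // Hypothetical stand-in for a token annotation: its covered text
    // and the tag value read from the configured feature.
    public static final class Token {
        final String coveredText;
        final String tag;

        public Token(String coveredText, String tag) {
            this.coveredText = coveredText;
            this.tag = tag;
        }
    }

    // Buggy variant: the tag is appended to the token list.
    public static List<List<String>> collectBuggy(List<Token> annotations) {
        List<String> tokens = new ArrayList<>();
        List<String> tags = new ArrayList<>();
        for (Token t : annotations) {
            tokens.add(t.coveredText);
            tokens.add(t.tag); // wrong list: should be tags
        }
        return Arrays.asList(tokens, tags);
    }

    // Fixed variant: the tag goes to the tag list, keeping both lists aligned.
    public static List<List<String>> collectFixed(List<Token> annotations) {
        List<String> tokens = new ArrayList<>();
        List<String> tags = new ArrayList<>();
        for (Token t : annotations) {
            tokens.add(t.coveredText);
            tags.add(t.tag);
        }
        return Arrays.asList(tokens, tags);
    }

    public static void main(String[] args) {
        List<Token> sentence = Arrays.asList(new Token("The", "DT"), new Token("dog", "NN"));
        System.out.println(collectBuggy(sentence)); // [[The, DT, dog, NN], []]
        System.out.println(collectFixed(sentence)); // [[The, dog], [DT, NN]]
    }
}
```

With the buggy variant, the token list is twice as long as expected and the tag list is empty, so the two arrays no longer line up for training.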

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Resolved] (OPENNLP-204) UIMA POSTaggerTrainer wrongly parses token annotations

Posted by "Jörn Kottmann (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/OPENNLP-204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jörn Kottmann resolved OPENNLP-204.
-----------------------------------

    Resolution: Fixed
      Assignee: Jörn Kottmann

It is fixed as suggested. Could you please test it and close the issue to confirm a positive outcome?


[jira] [Commented] (OPENNLP-204) UIMA POSTaggerTrainer wrongly parses token annotations

Posted by "Jörn Kottmann (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/OPENNLP-204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054376#comment-13054376 ] 

Jörn Kottmann commented on OPENNLP-204:
---------------------------------------

Thanks for pointing it out. Would you mind attaching a patch file to this issue?


[jira] [Commented] (OPENNLP-204) UIMA POSTaggerTrainer wrongly parses token annotations

Posted by "Jörn Kottmann (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/OPENNLP-204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054378#comment-13054378 ] 

Jörn Kottmann commented on OPENNLP-204:
---------------------------------------

Ahhh, that's very simple; I will just change the line.


[jira] [Commented] (OPENNLP-204) UIMA POSTaggerTrainer wrongly parses token annotations

Posted by "Nicolas Hernandez (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/OPENNLP-204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054383#comment-13054383 ] 

Nicolas Hernandez commented on OPENNLP-204:
-------------------------------------------

Confirmed.


[jira] [Closed] (OPENNLP-204) UIMA POSTaggerTrainer wrongly parses token annotations

Posted by "Nicolas Hernandez (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/OPENNLP-204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicolas Hernandez closed OPENNLP-204.
-------------------------------------


The fix solves the problem as expected.
