Posted to issues@opennlp.apache.org by "William Colen (JIRA)" <ji...@apache.org> on 2011/07/27 18:21:09 UTC

[jira] [Commented] (OPENNLP-231) POS Tagger cross validator tool is not evaluating models that includes ngram dictionaries.

    [ https://issues.apache.org/jira/browse/OPENNLP-231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071835#comment-13071835 ] 

William Colen commented on OPENNLP-231:
---------------------------------------

The ngram dictionary is created from the sample data. The POSTaggerCrossValidator class expects an ngram dictionary in its constructor, but if we build this dictionary from the entire sample and pass it to POSTaggerCrossValidator, the evaluation would be unfair: the dictionary would already contain data from the partition that each fold holds out for testing.
Instead of passing the ngram dictionary, we should pass the cutoff and let the evaluate method create the dictionary from the training sample of each fold.
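
To illustrate the idea, here is a minimal sketch in plain Java. It does not use the OpenNLP API: the class FoldDictionarySketch and the methods buildDictionary and dictionariesPerFold are hypothetical names, the dictionary is simplified to single tokens, and the fold assignment (sample index modulo the number of folds) is only an assumption for the example. The point is that the dictionary for each fold is derived from that fold's training partition and the cutoff, never from the full sample.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class FoldDictionarySketch {

    // Hypothetical helper: keeps every token that occurs at least
    // `cutoff` times in the given training sentences.
    static Set<String> buildDictionary(List<String[]> trainingSentences, int cutoff) {
        Map<String, Integer> counts = new HashMap<>();
        for (String[] sentence : trainingSentences) {
            for (String token : sentence) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        Set<String> dictionary = new HashSet<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() >= cutoff) {
                dictionary.add(entry.getKey());
            }
        }
        return dictionary;
    }

    // Returns one dictionary per fold, each built only from the samples
    // that are NOT held out for testing in that fold.
    static List<Set<String>> dictionariesPerFold(List<String[]> samples, int folds, int cutoff) {
        List<Set<String>> result = new ArrayList<>();
        for (int fold = 0; fold < folds; fold++) {
            List<String[]> training = new ArrayList<>();
            for (int i = 0; i < samples.size(); i++) {
                if (i % folds != fold) {
                    training.add(samples.get(i));
                }
            }
            // A real cross validator would now train on `training` with this
            // dictionary and evaluate on the held-out samples of the fold.
            result.add(buildDictionary(training, cutoff));
        }
        return result;
    }
}
```

With this shape, the caller only supplies the cutoff, and tokens that appear solely in a fold's test partition cannot leak into the dictionary used to train that fold's model.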

> POS Tagger cross validator tool is not evaluating models that includes ngram dictionaries.
> ------------------------------------------------------------------------------------------
>
>                 Key: OPENNLP-231
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-231
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: Command Line Interface, POS Tagger
>    Affects Versions: tools-1.5.2-incubating
>            Reporter: William Colen
>            Assignee: William Colen
>            Priority: Minor
>             Fix For: tools-1.5.2-incubating
>
>
> The -ngram parameter is available in the POS Tagger trainer tool, but not in the CV tool.
