Posted to issues@opennlp.apache.org by "William Colen (JIRA)" <ji...@apache.org> on 2011/07/18 22:23:57 UTC

[jira] [Created] (OPENNLP-231) POS Tagger cross validator tool is not evaluating models that includes ngram dictionaries.

POS Tagger cross validator tool is not evaluating models that includes ngram dictionaries.
------------------------------------------------------------------------------------------

                 Key: OPENNLP-231
                 URL: https://issues.apache.org/jira/browse/OPENNLP-231
             Project: OpenNLP
          Issue Type: Improvement
          Components: Command Line Interface, POS Tagger
    Affects Versions: tools-1.5.2-incubating
            Reporter: William Colen
            Assignee: William Colen
            Priority: Minor
             Fix For: tools-1.5.2-incubating


The -ngram parameter is available in the POS Tagger trainer tool, but not in the cross validator (CV) tool.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (OPENNLP-231) POS Tagger cross validator tool is not evaluating models that includes ngram dictionaries.

Posted by "William Colen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OPENNLP-231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071835#comment-13071835 ] 

William Colen commented on OPENNLP-231:
---------------------------------------

The ngram dictionary is created from the sample data. The POSTaggerCrossValidator class expects an ngram dictionary in its constructor, but if we create this dictionary from the entire sample and pass it to POSTaggerCrossValidator, the evaluation would be unfair, because the dictionary would already contain ngrams from the data held out for testing.
Instead of passing the ngram dictionary, we should pass the cutoff and let the evaluate method create the dictionary from the training sample of each fold.
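
A minimal sketch of this idea, not the actual OpenNLP implementation. The class and method names below are hypothetical, the dictionary is simplified to single tokens, and treating the cutoff as a minimum count (keep entries seen at least `cutoff` times) is an assumption. It only illustrates that each fold builds its dictionary from its own training partition.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration; this is not the POSTaggerCrossValidator API. */
final class FoldDictionarySketch {

    /** Collects the tokens that occur at least `cutoff` times in the given sentences. */
    static Set<String> buildDictionary(List<String[]> sentences, int cutoff) {
        Map<String, Integer> counts = new HashMap<>();
        for (String[] sentence : sentences) {
            for (String token : sentence) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        Set<String> dictionary = new HashSet<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() >= cutoff) {
                dictionary.add(entry.getKey());
            }
        }
        return dictionary;
    }

    /** Training partition of one fold: every sample whose index does not fall into that fold. */
    static List<String[]> trainingPartition(List<String[]> samples, int folds, int fold) {
        List<String[]> train = new ArrayList<>();
        for (int i = 0; i < samples.size(); i++) {
            if (i % folds != fold) {
                train.add(samples.get(i));
            }
        }
        return train;
    }

    /** One dictionary per fold, each built only from that fold's training partition. */
    static List<Set<String>> dictionariesPerFold(List<String[]> samples, int folds, int cutoff) {
        List<Set<String>> result = new ArrayList<>();
        for (int fold = 0; fold < folds; fold++) {
            result.add(buildDictionary(trainingPartition(samples, folds, fold), cutoff));
        }
        return result;
    }
}
```

Building the dictionary once from all samples would instead let counts from the held-out sentences influence which entries pass the cutoff, which is the unfairness described above.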


[jira] [Commented] (OPENNLP-231) POS Tagger cross validator tool is not evaluating models that includes ngram dictionaries.

Posted by "Joern Kottmann (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OPENNLP-231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089363#comment-13089363 ] 

Joern Kottmann commented on OPENNLP-231:
----------------------------------------

You might want to create the ngram dictionary from a much larger text corpus instead of the training data. Now both are possible. If I remember correctly, Tom told me that it didn't work well and was more of an experiment. Maybe we should validate this statement, and if it turns out to be true, it should be removed one day.


[jira] [Commented] (OPENNLP-231) POS Tagger cross validator tool is not evaluating models that includes ngram dictionaries.

Posted by "William Colen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OPENNLP-231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089461#comment-13089461 ] 

William Colen commented on OPENNLP-231:
---------------------------------------

OK, so I will keep my changes and close it.
I will improve the constructors while working on the Monitors.


[jira] [Closed] (OPENNLP-231) POS Tagger cross validator tool is not evaluating models that includes ngram dictionaries.

Posted by "William Colen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/OPENNLP-231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

William Colen closed OPENNLP-231.
---------------------------------

    Resolution: Fixed


[jira] [Commented] (OPENNLP-231) POS Tagger cross validator tool is not evaluating models that includes ngram dictionaries.

Posted by "Joern Kottmann (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OPENNLP-231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089464#comment-13089464 ] 

Joern Kottmann commented on OPENNLP-231:
----------------------------------------

+1, yes, close it.


[jira] [Commented] (OPENNLP-231) POS Tagger cross validator tool is not evaluating models that includes ngram dictionaries.

Posted by "William Colen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OPENNLP-231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089262#comment-13089262 ] 

William Colen commented on OPENNLP-231:
---------------------------------------

Could someone please review whether my changes make sense? Thank you.
