Posted to issues@opennlp.apache.org by "William Colen (JIRA)" <ji...@apache.org> on 2012/05/08 21:49:49 UTC

[jira] [Created] (OPENNLP-508) Add an option to create or expand a TagDictionary with training data

William Colen created OPENNLP-508:
-------------------------------------

             Summary: Add an option to create or expand a TagDictionary with training data
                 Key: OPENNLP-508
                 URL: https://issues.apache.org/jira/browse/OPENNLP-508
             Project: OpenNLP
          Issue Type: New Feature
          Components: POS Tagger
    Affects Versions: tools-1.5.3
            Reporter: William Colen
            Assignee: William Colen
             Fix For: tools-1.5.3


It would be useful if we could expand or create the TagDictionary while training a POS Tagger model.

I propose that we add a new command line argument, -tagDictCutoff, that would trigger the creation / expansion of the dictionary. The cutoff would be the minimum number of times a word-tag pair must occur in the training data before it is added to the dictionary.
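
To illustrate the cutoff rule described above, here is a minimal sketch in plain Java. It does not use the OpenNLP API; the class name TagDictCutoffSketch, the method buildDictionary, and the representation of a training token as a two-element String array {word, tag} are all assumptions made for the example, not the actual implementation.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration only; not part of OpenNLP.
public class TagDictCutoffSketch {

    /**
     * Builds a word -> allowed-tags mapping from training tokens.
     * Each token is a String[] {word, tag} (an assumed representation).
     * A word-tag pair is kept only if it occurs at least `cutoff` times.
     */
    static Map<String, Set<String>> buildDictionary(List<String[]> training, int cutoff) {
        // Count occurrences of every (word, tag) pair.
        Map<String, Map<String, Integer>> counts = new HashMap<>();
        for (String[] pair : training) {
            counts.computeIfAbsent(pair[0], w -> new HashMap<>())
                  .merge(pair[1], 1, Integer::sum);
        }

        // Keep only the pairs whose count reaches the cutoff.
        Map<String, Set<String>> dict = new TreeMap<>();
        for (Map.Entry<String, Map<String, Integer>> wordEntry : counts.entrySet()) {
            for (Map.Entry<String, Integer> tagEntry : wordEntry.getValue().entrySet()) {
                if (tagEntry.getValue() >= cutoff) {
                    dict.computeIfAbsent(wordEntry.getKey(), w -> new TreeSet<>())
                        .add(tagEntry.getKey());
                }
            }
        }
        return dict;
    }

    public static void main(String[] args) {
        List<String[]> training = Arrays.asList(
            new String[] {"run", "VB"},
            new String[] {"run", "VB"},
            new String[] {"run", "NN"},   // occurs once, dropped with cutoff 2
            new String[] {"the", "DT"},
            new String[] {"the", "DT"});

        // prints {run=[VB], the=[DT]}
        System.out.println(buildDictionary(training, 2));
    }
}
```

With a cutoff of 2, the single run/NN occurrence is treated as too rare and is left out, while run/VB and the/DT are added.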

Further information can be found in this conversation: http://mail-archives.apache.org/mod_mbox/opennlp-dev/201205.mbox/%3CCA%2BiWThJNQzLSc3NmDLbEzaORDWnFgbk_id3SJjuELVRSoMTJzQ%40mail.gmail.com%3E

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Closed] (OPENNLP-508) Add an option to create or expand a TagDictionary with training data

Posted by "William Colen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/OPENNLP-508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

William Colen closed OPENNLP-508.
---------------------------------

    Resolution: Fixed

Now we can optionally create the TagDictionary from the training data. When performing cross-validation, only the training portion of the data is added to the dictionary.
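
The cross-validation behaviour can be pictured with the following sketch in plain Java. It is not the OpenNLP code: the class FoldDictionarySketch, the index-modulo fold assignment, and the String[] {word, tag} token representation are assumptions chosen to keep the example short. The point it shows is that the dictionary for a fold is built only from the samples outside that fold's test partition.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration only; not part of OpenNLP.
public class FoldDictionarySketch {

    /** Returns every sample that is NOT in the given test fold (assumed split: index % folds). */
    static <T> List<T> trainingPartition(List<T> samples, int folds, int testFold) {
        List<T> train = new ArrayList<>();
        for (int i = 0; i < samples.size(); i++) {
            if (i % folds != testFold) {
                train.add(samples.get(i));
            }
        }
        return train;
    }

    /** Collects word -> tags from {word, tag} pairs; no cutoff is applied here. */
    static Map<String, Set<String>> dictionaryFrom(List<String[]> pairs) {
        Map<String, Set<String>> dict = new TreeMap<>();
        for (String[] p : pairs) {
            dict.computeIfAbsent(p[0], w -> new TreeSet<>()).add(p[1]);
        }
        return dict;
    }

    public static void main(String[] args) {
        List<String[]> samples = Arrays.asList(
            new String[] {"the", "DT"},
            new String[] {"dog", "NN"},
            new String[] {"runs", "VBZ"},
            new String[] {"fast", "RB"});

        int folds = 2;
        for (int testFold = 0; testFold < folds; testFold++) {
            // The dictionary never sees the samples held out for evaluation.
            List<String[]> train = trainingPartition(samples, folds, testFold);
            System.out.println("fold " + testFold + ": " + dictionaryFrom(train));
        }
    }
}
```

Keeping held-out samples out of the dictionary avoids leaking evaluation data into the model being evaluated.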
                
> Add an option to create or expand a TagDictionary with training data
> --------------------------------------------------------------------
>
>                 Key: OPENNLP-508
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-508
>             Project: OpenNLP
>          Issue Type: New Feature
>          Components: POS Tagger
>    Affects Versions: tools-1.5.3
>            Reporter: William Colen
>            Assignee: William Colen
>             Fix For: tools-1.5.3
>
>
> It would be useful if we could expand or create the TagDictionary while training a POS Tagger model.
> I propose that we add a new command line argument, -tagDictCutoff, that would trigger the creation / expansion of the dictionary. The cutoff would be the minimum number of times a word-tag pair must occur in the training data before it is added to the dictionary.
> Further information can be found in this conversation: http://mail-archives.apache.org/mod_mbox/opennlp-dev/201205.mbox/%3CCA%2BiWThJNQzLSc3NmDLbEzaORDWnFgbk_id3SJjuELVRSoMTJzQ%40mail.gmail.com%3E
