Posted to issues@opennlp.apache.org by "William Colen (JIRA)" <ji...@apache.org> on 2011/07/22 18:37:58 UTC

[jira] [Created] (OPENNLP-237) Add abbreviation dictionary support to Tokenizer

Add abbreviation dictionary support to Tokenizer
------------------------------------------------

                 Key: OPENNLP-237
                 URL: https://issues.apache.org/jira/browse/OPENNLP-237
             Project: OpenNLP
          Issue Type: Improvement
          Components: Tokenizer
    Affects Versions: tools-1.5.2-incubating
            Reporter: William Colen
            Assignee: William Colen
            Priority: Minor
             Fix For: tools-1.5.2-incubating


The Tokenizer component could use an abbreviation dictionary in its context generator.
Although this modifies the default tokenizer context generator, it will not break compatibility with old models, because the abbreviation features are generated only when the dictionary is present.
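
A minimal sketch of the compatibility argument above, not the actual OpenNLP implementation: the class name, the `getContext` signature, and the feature strings (`p=`, `s=`, `abb`) are all hypothetical and chosen only for illustration. It assumes the dictionary is optional and shows that the baseline features are unchanged when it is absent.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; this is NOT OpenNLP's context generator.
public class AbbreviationAwareContextSketch {

    /**
     * Builds features for a candidate split point inside a token.
     * The abbreviations set may be null, which models the case of an
     * old model that was trained without an abbreviation dictionary.
     */
    public static List<String> getContext(String token, int splitIndex,
                                          Set<String> abbreviations) {
        List<String> features = new ArrayList<>();

        // Baseline features: always generated, with or without a dictionary.
        features.add("p=" + token.substring(0, splitIndex));
        features.add("s=" + token.substring(splitIndex));

        // Extra feature: generated only if a dictionary is present,
        // so the baseline feature set is left untouched otherwise.
        if (abbreviations != null && abbreviations.contains(token)) {
            features.add("abb");
        }
        return features;
    }

    public static void main(String[] args) {
        Set<String> abbreviations = new HashSet<>(Arrays.asList("Mr.", "Dr."));

        // No dictionary: only the baseline features are produced.
        System.out.println(getContext("Mr.", 2, null));

        // Dictionary present: the abbreviation feature is added.
        System.out.println(getContext("Mr.", 2, abbreviations));
    }
}
```

Because the extra feature is only ever added to the baseline list, a model trained without a dictionary would see the same features as before.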

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Closed] (OPENNLP-237) Add abbreviation dictionary support to Tokenizer

Posted by "William Colen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/OPENNLP-237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

William Colen closed OPENNLP-237.
---------------------------------

    Resolution: Fixed

