Posted to issues@opennlp.apache.org by "Tommaso Teofili (JIRA)" <ji...@apache.org> on 2015/08/07 10:37:45 UTC

[jira] [Commented] (OPENNLP-659) Language models

    [ https://issues.apache.org/jira/browse/OPENNLP-659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661508#comment-14661508 ] 

Tommaso Teofili commented on OPENNLP-659:
-----------------------------------------

I've pushed the last patch to a separate branch of OpenNLP [in my GitHub fork|https://github.com/tteofili/opennlp/tree/opennlp-741]. I'll create and attach the final patch once it is done (most of the remaining work is on evaluation).

> Language models
> ---------------
>
>                 Key: OPENNLP-659
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-659
>             Project: OpenNLP
>          Issue Type: New Feature
>    Affects Versions: tools-1.5.3
>         Environment: all
>            Reporter: Martin Wunderlich
>            Assignee: Tommaso Teofili
>            Priority: Minor
>              Labels: features, language, model
>         Attachments: OPENNLP-659.0.patch, OPENNLP-659.1.patch
>
>   Original Estimate: 7m
>  Remaining Estimate: 7m
>
> This feature request is for the inclusion of n-gram language models in OpenNLP. The language models could either be preconstructed from existing corpora for various languages, or they could be built by the user from sample texts. There should be at least unigram, bigram, and trigram LMs, with absolute and relative frequencies for each n-gram.
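
To make the request above concrete, here is a minimal sketch of what "absolute and relative frequencies for each n-gram" could look like for a user-supplied sample text. It is written in Python purely for illustration and is not the OpenNLP API or the attached patch; the function name `ngram_frequencies`, the whitespace tokenization, and the sample sentence are all assumptions.

```python
from collections import Counter


def ngram_frequencies(tokens, n):
    """Map each n-gram to (absolute count, relative frequency).

    Hypothetical helper for illustration only; not part of OpenNLP.
    The relative frequency is the count divided by the total number
    of n-grams of that order found in the token sequence.
    """
    # Slide a window of size n over the tokens; empty if len(tokens) < n.
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    counts = Counter(ngrams)
    total = sum(counts.values())
    return {gram: (count, count / total) for gram, count in counts.items()}


# Assumed sample text, tokenized naively on whitespace.
tokens = "the cat sat on the mat the cat ran".split()

unigrams = ngram_frequencies(tokens, 1)
bigrams = ngram_frequencies(tokens, 2)
trigrams = ngram_frequencies(tokens, 3)

for name, table in (("unigram", unigrams), ("bigram", bigrams), ("trigram", trigrams)):
    # Print the most frequent n-gram of each order with both frequencies.
    gram, (count, rel) = max(table.items(), key=lambda item: item[1][0])
    print(f"{name}: {' '.join(gram)} count={count} relative={rel:.3f}")
```

A real language model would also need smoothing and a probability estimate for unseen n-grams; this sketch only covers the counting step that the request mentions.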



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)