Posted to issues@opennlp.apache.org by "Rodrigo Agerri (JIRA)" <ji...@apache.org> on 2015/04/08 09:09:12 UTC
[jira] [Updated] (OPENNLP-760) probabilistic lemmatizer
[ https://issues.apache.org/jira/browse/OPENNLP-760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rodrigo Agerri updated OPENNLP-760:
-----------------------------------
Description:
The current SimpleLemmatizer is dictionary-based. A probabilistic lemmatizer would work better for words missing from the dictionary. An open source tool already does this with good results:
https://code.google.com/p/mate-tools
Of the two papers below, the first describes the general idea and the
second presents experiments in a realistic setting.
http://grzegorz.chrupala.me/papers/chrupala-2006/paper.pdf
http://grzegorz.chrupala.me/papers/chrupala-etal-2008a/paper.pdf
was:
Current SimpleLemmatizer is dictionary-based. A probabilistic lemmatizer works better for unknown words. There is already an open source tool which we could base an OpenNLP implementation on.
https://code.google.com/p/mate-tools
This is the algorithm. The first paper describes the general idea and the
second presents the experiments in a realistic environment.
http://grzegorz.chrupala.me/papers/chrupala-2006/paper.pdf
http://grzegorz.chrupala.me/papers/chrupala-etal-2008a/paper.pdf
> probabilistic lemmatizer
> ------------------------
>
> Key: OPENNLP-760
> URL: https://issues.apache.org/jira/browse/OPENNLP-760
> Project: OpenNLP
> Issue Type: New Feature
> Components: Lemmatizer
> Reporter: Rodrigo Agerri
> Priority: Minor
>
> The current SimpleLemmatizer is dictionary-based. A probabilistic lemmatizer would work better for words missing from the dictionary. An open source tool already does this with good results:
> https://code.google.com/p/mate-tools
> Of the two papers below, the first describes the general idea and the
> second presents experiments in a realistic setting.
> http://grzegorz.chrupala.me/papers/chrupala-2006/paper.pdf
> http://grzegorz.chrupala.me/papers/chrupala-etal-2008a/paper.pdf
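
To make the contrast in the description concrete, here is a rough sketch of why a dictionary lookup fails on unknown words while a data-driven lemmatizer can still guess. It is not the mate-tools algorithm, not the method of the linked papers, and not OpenNLP code: every name below (DictionaryLemmatizer, SuffixRuleLemmatizer, suffix_rule, PAIRS) is hypothetical, the training pairs are toy data, and plain frequency counts over word endings stand in for a real probabilistic classifier.

```python
from collections import Counter, defaultdict

# Toy (word form, lemma) training pairs; purely illustrative data.
PAIRS = [
    ("walked", "walk"), ("talked", "talk"),
    ("jumping", "jump"), ("playing", "play"),
    ("cats", "cat"), ("dogs", "dog"),
]


def suffix_rule(word, lemma):
    """Strip the longest common prefix and return (word_suffix, lemma_suffix)."""
    i = 0
    while i < min(len(word), len(lemma)) and word[i] == lemma[i]:
        i += 1
    return word[i:], lemma[i:]


class DictionaryLemmatizer:
    """Pure lookup: only word forms seen in the dictionary get a lemma."""

    def __init__(self, pairs):
        self.table = dict(pairs)

    def lemmatize(self, word):
        return self.table.get(word)  # None for unknown words


class SuffixRuleLemmatizer:
    """Counts which suffix rewrite follows which word ending, then applies
    the most frequent rewrite for the longest ending seen in training."""

    def __init__(self, pairs, max_context=3):
        self.max_context = max_context
        self.counts = defaultdict(Counter)
        for word, lemma in pairs:
            rule = suffix_rule(word, lemma)
            # Index the rule under every ending of length 1..max_context.
            for n in range(1, min(max_context, len(word)) + 1):
                self.counts[word[-n:]][rule] += 1

    def lemmatize(self, word):
        # Back off from the longest known ending to the shortest.
        for n in range(min(self.max_context, len(word)), 0, -1):
            rules = self.counts.get(word[-n:])
            if not rules:
                continue
            (strip, add), _ = rules.most_common(1)[0]
            if word.endswith(strip):
                return word[:len(word) - len(strip)] + add
        return word  # no rule applies: fall back to the word form itself


dictionary_based = DictionaryLemmatizer(PAIRS)
rule_based = SuffixRuleLemmatizer(PAIRS)

print(dictionary_based.lemmatize("parked"))  # None: "parked" is not listed
print(rule_based.lemmatize("parked"))        # park
```

A real implementation would replace the ending counts with a trained classifier over richer features (for example the part-of-speech tag), but the division of labour is the same: learn rewrites from annotated pairs, then apply the most likely one to forms never seen before.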