Posted to issues@opennlp.apache.org by "Martin Wiesner (Jira)" <ji...@apache.org> on 2023/02/26 15:31:00 UTC

[jira] [Commented] (OPENNLP-1229) stem function giving wrong output

    [ https://issues.apache.org/jira/browse/OPENNLP-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17693659#comment-17693659 ] 

Martin Wiesner commented on OPENNLP-1229:
-----------------------------------------

Actually, "*thi*" is the expected stem of "this" for every {{PorterStemmer}} implementation, so this is not a real bug. For context, see FAQ No. 3 at [https://tartarus.org/martin/PorterStemmer/].

You can cross-check this with [NLTK|http://text-processing.com/demo/stem/], which also yields "thi" as the stem of "this".

Hint: the English {{SnowballStemmer}} leaves "this" unchanged ("this" -> "this").
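
To illustrate where the difference comes from, here is a minimal sketch of step 1a only, as I understand it from the published descriptions of the original Porter algorithm and of the Snowball English ("Porter2") algorithm. It is not the OpenNLP code; the class name {{Step1aSketch}} and both method names are made up for this example, and all other steps of the two algorithms are left out (none of them changes "thi" or "this" afterwards, as far as I can tell).

```java
// Illustrative sketch only: step 1a of the original Porter algorithm and of
// the Snowball English ("Porter2") algorithm. Hypothetical names; this is
// NOT the OpenNLP implementation and it omits every other step.
final class Step1aSketch {

    // Original Porter, step 1a: SSES -> SS, IES -> I, SS -> SS, S -> (removed).
    // The final rule strips a trailing "s" unconditionally.
    static String porterStep1a(String w) {
        if (w.endsWith("sses")) return w.substring(0, w.length() - 2);
        if (w.endsWith("ies"))  return w.substring(0, w.length() - 2);
        if (w.endsWith("ss"))   return w;
        if (w.endsWith("s"))    return w.substring(0, w.length() - 1);
        return w;
    }

    // Snowball English, step 1a: a trailing "s" is removed only if the part
    // before it contains a vowel that is not the letter right before the "s".
    static String snowballStep1a(String w) {
        if (w.endsWith("sses")) return w.substring(0, w.length() - 2);
        if (w.endsWith("ied") || w.endsWith("ies")) {
            // "i" if preceded by more than one letter, otherwise "ie"
            return w.length() > 4
                    ? w.substring(0, w.length() - 2)
                    : w.substring(0, w.length() - 1);
        }
        if (w.endsWith("us") || w.endsWith("ss")) return w;
        if (w.endsWith("s")) {
            // Scan every letter except the one immediately before the "s".
            for (int i = 0; i < w.length() - 2; i++) {
                if ("aeiouy".indexOf(w.charAt(i)) >= 0) {
                    return w.substring(0, w.length() - 1);
                }
            }
        }
        return w;
    }

    public static void main(String[] args) {
        System.out.println(porterStep1a("this"));   // thi
        System.out.println(snowballStep1a("this")); // this
    }
}
```

In "this", the only vowel sits directly before the final "s", so the Snowball rule keeps the word intact, while the original Porter rule strips the "s" without looking at the rest of the word.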

> stem function giving wrong output
> ---------------------------------
>
>                 Key: OPENNLP-1229
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-1229
>             Project: OpenNLP
>          Issue Type: Bug
>          Components: Stemmer
>         Environment: Ubuntu-18.04, JDK-8
>            Reporter: Divya Rani
>            Priority: Minor
>
> OpenNLP uses PorterStemmer for stemming, and PorterStemmer seems to stem "this" -> "thi".



--
This message was sent by Atlassian Jira
(v8.20.10#820010)