Posted to issues@opennlp.apache.org by "Martin Wiesner (Jira)" <ji...@apache.org> on 2023/02/26 15:41:00 UTC

[jira] [Comment Edited] (OPENNLP-1229) stem function giving wrong output

    [ https://issues.apache.org/jira/browse/OPENNLP-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17693659#comment-17693659 ] 

Martin Wiesner edited comment on OPENNLP-1229 at 2/26/23 3:40 PM:
------------------------------------------------------------------

Actually, "{*}thi{*}" is the expected output for "this" in every {{PorterStemmer}} implementation, so this is not a real bug. For context, see FAQ No. 3 at [https://tartarus.org/martin/PorterStemmer/].

You can cross-check this with [NLTK|http://text-processing.com/demo/stem/], which also yields "thi" as the stem of "this".

Hint: {{SnowballStemmer}} stems "this" -> "this".
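
To illustrate where the difference comes from, here is a minimal sketch of step 1a only (plural "s" handling), assuming the rules as published for the original Porter algorithm and for the Snowball English (Porter2) stemmer. The class and method names are hypothetical, this is not the OpenNLP code, and all later stemming steps are left out. It also treats "y" as a vowel everywhere, which is a simplification.

```java
// Simplified sketch of step 1a only. Hypothetical names; not the OpenNLP implementation.
final class Step1aSketch {

    // Original Porter step 1a: SSES -> SS, IES -> I, SS -> SS, S -> "".
    static String porterStep1a(String w) {
        if (w.endsWith("sses")) return w.substring(0, w.length() - 2);
        if (w.endsWith("ies")) return w.substring(0, w.length() - 2);
        if (w.endsWith("ss")) return w;
        // Any other trailing "s" is dropped unconditionally, hence "this" -> "thi".
        if (w.endsWith("s")) return w.substring(0, w.length() - 1);
        return w;
    }

    // Snowball English (Porter2) step 1a, simplified: "y" always counts as a vowel here.
    static String snowballStep1a(String w) {
        if (w.endsWith("sses")) return w.substring(0, w.length() - 2);
        if (w.endsWith("ied") || w.endsWith("ies")) {
            // Replace by "i" if more than one letter precedes the suffix, otherwise by "ie".
            return w.length() > 4
                    ? w.substring(0, w.length() - 2)
                    : w.substring(0, w.length() - 1);
        }
        if (w.endsWith("us") || w.endsWith("ss")) return w;
        if (w.endsWith("s")) {
            // Drop "s" only if a vowel occurs somewhere before the letter
            // that directly precedes the "s".
            for (int i = 0; i < w.length() - 2; i++) {
                if ("aeiouy".indexOf(w.charAt(i)) >= 0) {
                    return w.substring(0, w.length() - 1);
                }
            }
        }
        return w;
    }

    public static void main(String[] args) {
        System.out.println(porterStep1a("this"));   // thi
        System.out.println(snowballStep1a("this")); // this
        System.out.println(snowballStep1a("gaps")); // gap
    }
}
```

In "this", the only vowel sits directly before the "s", so the Snowball rule keeps the word intact, while the original Porter rule strips the "s" without any such check.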


was (Author: mawiesne):
Actually, "*thi*" is the expected outcome for "this" for every {{PorterStemmer}} implementation and thus not a real "bug". For context, see FAQ No. 3 at [https://tartarus.org/martin/PorterStemmer/].

You can cross-check this with [NLTK|http://text-processing.com/demo/stem/]. It yields the same "thi" as stem of "this".

Hint: {{SnowballStemmer}} stems "this" -> "this".

> stem function giving wrong output
> ---------------------------------
>
>                 Key: OPENNLP-1229
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-1229
>             Project: OpenNLP
>          Issue Type: Bug
>          Components: Stemmer
>         Environment: Ubuntu-18.04, JDK-8
>            Reporter: Divya Rani
>            Assignee: Martin Wiesner
>            Priority: Minor
>
> OpenNLP uses PorterStemmer for stemming, and PorterStemmer appears to stem "this" -> "thi".



--
This message was sent by Atlassian Jira
(v8.20.10#820010)