Posted to issues@lucene.apache.org by "Ben Kazez (Jira)" <ji...@apache.org> on 2020/06/18 14:54:00 UTC
[jira] [Created] (LUCENE-9410) German/French stemmers fail for common forms maux, gegrüßt, grüßend, schlummert
Ben Kazez created LUCENE-9410:
---------------------------------
Summary: German/French stemmers fail for common forms maux, gegrüßt, grüßend, schlummert
Key: LUCENE-9410
URL: https://issues.apache.org/jira/browse/LUCENE-9410
Project: Lucene - Core
Issue Type: Bug
Components: modules/analysis
Affects Versions: 8.5
Environment: Elasticsearch 7.7.1 running on cloud.elastic.co
Reporter: Ben Kazez
I'm using Lucene via Elasticsearch 7.7.1 and have run into an issue where German and French stemming (whether via the Snowball analyzer or the "light" or "heavy" stemming analyzers) fails to reduce some common inflected forms to the same stem as their base form:
- French:
  - "maux" should match "mal" ("maux" is the plural of "mal") but is instead left unchanged
- German:
  - "schlummert" should match "schlummern" (infinitive) but is instead left unchanged
  - "grüßend" should match "grüßen" (infinitive) but instead yields "grussend"
  - "gegrüßt" should match "grüßen" (infinitive) but instead yields "gegrusst"
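The cases above could be checked against any stemmer with a small harness like the following. This is only a sketch: `CASES` is copied from the list above, while `failing_pairs` and the `stem` callable are hypothetical names introduced here for illustration and are not part of Lucene or Elasticsearch. In practice `stem` would wrap whatever analyzer is under test (for example, a call to the Elasticsearch `_analyze` API).

```python
from typing import Callable, List, Tuple

# (language, inflected form, base form) triples taken from the report.
CASES = [
    ("french", "maux", "mal"),
    ("german", "schlummert", "schlummern"),
    ("german", "grüßend", "grüßen"),
    ("german", "gegrüßt", "grüßen"),
]


def failing_pairs(
    stem: Callable[[str, str], str],
) -> List[Tuple[str, str, str, str, str]]:
    """Return the cases whose two forms do not reduce to the same stem.

    `stem` is a hypothetical callable taking (language, word) and
    returning the stemmed token produced by the analyzer under test.
    """
    failures = []
    for lang, inflected, base in CASES:
        stem_inflected = stem(lang, inflected)
        stem_base = stem(lang, base)
        if stem_inflected != stem_base:
            failures.append((lang, inflected, stem_inflected, base, stem_base))
    return failures


if __name__ == "__main__":
    # Placeholder stemmer that leaves every word unchanged; replace it
    # with a wrapper around the real analyzer to reproduce the report.
    for failure in failing_pairs(lambda lang, word: word):
        print(failure)
```

A pair counts as failing whenever the two stems differ, which is what matters for search matching; the stems themselves do not need to be dictionary words.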
The folks from Elasticsearch said I should file a bug with Lucene: https://discuss.elastic.co/t/better-french-and-german-stemming/236283
--
This message was sent by Atlassian Jira
(v8.3.4#803005)