Posted to issues@lucene.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2021/01/19 08:33:00 UTC
[jira] [Commented] (LUCENE-9676) Hunspell: improve stemming of all-caps words
[ https://issues.apache.org/jira/browse/LUCENE-9676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267742#comment-17267742 ]
ASF subversion and git services commented on LUCENE-9676:
---------------------------------------------------------
Commit 422c89baefda8e8e8a9a3f9b761785895b5e9bb5 in lucene-solr's branch refs/heads/master from Peter Gromov
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=422c89b ]
LUCENE-9676: Hunspell: improve stemming of all-caps words (#2217)
Hunspell: improve stemming of all-caps words
Repeat Hunspell's logic:
* when encountering a mixed-case or (inflectable) all-caps dictionary entry, add its title-case analog as a hidden entry
* use that hidden entry for stemming case variants for title- and uppercase words, but don't consider it a valid word itself
* ...unless there's another explicit dictionary entry of that title case
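The three rules above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the code from the commit: the class `HiddenTitleCaseSketch` and its methods `addEntry`, `stem`, `titleCase` and `isAllCaps` are hypothetical names, affixes and flags are ignored (so every mixed-case or all-caps entry is treated as eligible), and returning the originating dictionary form as the "stem" is a simplification chosen for this example.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Toy model of hidden title-case entries; hypothetical, not Lucene's Dictionary/Stemmer. */
final class HiddenTitleCaseSketch {

  /** A dictionary entry: the form it came from, and whether it is hidden. */
  private static final class Entry {
    final String source;
    final boolean hidden;

    Entry(String source, boolean hidden) {
      this.source = source;
      this.hidden = hidden;
    }
  }

  private final Map<String, Entry> entries = new HashMap<>();

  static String titleCase(String word) {
    if (word.isEmpty()) {
      return word;
    }
    return word.substring(0, 1).toUpperCase(Locale.ROOT)
        + word.substring(1).toLowerCase(Locale.ROOT);
  }

  static boolean isAllCaps(String word) {
    // The second check excludes words without any cased letters, such as "123".
    return word.equals(word.toUpperCase(Locale.ROOT))
        && !word.equals(word.toLowerCase(Locale.ROOT));
  }

  /** Adds an explicit entry, plus a hidden title-case analog for mixed-case or all-caps words. */
  void addEntry(String word) {
    // An explicit entry is always visible, even if a hidden one was registered earlier.
    entries.put(word, new Entry(word, false));
    String title = titleCase(word);
    boolean lowerCase = word.equals(word.toLowerCase(Locale.ROOT));
    if (!lowerCase && !word.equals(title)) {
      // Never overwrite an explicit title-case entry with a hidden one.
      entries.putIfAbsent(title, new Entry(word, true));
    }
  }

  /** Returns the dictionary form matched by the word, or null if there is none. */
  String stem(String word) {
    Entry exact = entries.get(word);
    if (exact != null && !exact.hidden) {
      return exact.source;
    }
    if (isAllCaps(word)) {
      // All-caps input may match through the title-case entry, hidden or not.
      Entry viaTitle = entries.get(titleCase(word));
      if (viaTitle != null) {
        return viaTitle.source;
      }
    }
    // A hidden entry is not a valid word by itself.
    return null;
  }
}
```

In this sketch, after `addEntry("OpenOffice.org")`, `stem("OPENOFFICE.ORG")` resolves through the hidden entry "Openoffice.org", while `stem("Openoffice.org")` returns null until that form is added explicitly.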
> Hunspell: improve stemming of all-caps words
> --------------------------------------------
>
> Key: LUCENE-9676
> URL: https://issues.apache.org/jira/browse/LUCENE-9676
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Peter Gromov
> Assignee: Dawid Weiss
> Priority: Major
> Fix For: master (9.0)
>
> Time Spent: 40m
> Remaining Estimate: 0h
>
> Currently, words like "OPENOFFICE.ORG" produce no stems even if the dictionary contains "OpenOffice.org".
--
This message was sent by Atlassian Jira
(v8.3.4#803005)