Posted to issues@lucene.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2021/01/19 08:33:00 UTC

[jira] [Commented] (LUCENE-9676) Hunspell: improve stemming of all-caps words

    [ https://issues.apache.org/jira/browse/LUCENE-9676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267742#comment-17267742 ] 

ASF subversion and git services commented on LUCENE-9676:
---------------------------------------------------------

Commit 422c89baefda8e8e8a9a3f9b761785895b5e9bb5 in lucene-solr's branch refs/heads/master from Peter Gromov
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=422c89b ]

LUCENE-9676: Hunspell: improve stemming of all-caps words (#2217)

Hunspell: improve stemming of all-caps words

Repeat Hunspell's logic:
* when encountering a mixed-case or (inflectable) all-caps dictionary entry, add its title-case analog as a hidden entry
* use that hidden entry when stemming case variants of title-case and uppercase words, but don't treat it as a valid word itself
* ...unless there's another explicit dictionary entry of that title case
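
The rules above can be sketched as a toy model. This is not Lucene's Dictionary or Stemmer code: the class HiddenTitleCaseSketch and all of its methods are hypothetical names made up for illustration, affixes are ignored entirely (so the "inflectable" condition is dropped and every explicit entry stems to itself), and only the all-caps lookup path is modeled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical toy model of hidden title-case entries; not Lucene's API. */
class HiddenTitleCaseSketch {
  // Words that appear explicitly in the dictionary; only these are valid words.
  private final Set<String> explicit = new HashSet<>();
  // Hidden title-case analog -> the explicit entries it was derived from.
  private final Map<String, List<String>> hidden = new HashMap<>();

  static String titleCase(String w) {
    if (w.isEmpty()) return w;
    return w.substring(0, 1).toUpperCase(Locale.ROOT)
        + w.substring(1).toLowerCase(Locale.ROOT);
  }

  static boolean isAllCaps(String w) {
    return w.equals(w.toUpperCase(Locale.ROOT)) && !w.equals(w.toLowerCase(Locale.ROOT));
  }

  static boolean isMixedCase(String w) {
    return !w.equals(w.toLowerCase(Locale.ROOT)) && !w.equals(titleCase(w)) && !isAllCaps(w);
  }

  /** Adds an explicit entry and, for mixed-case or all-caps words, a hidden title-case analog. */
  void addEntry(String word) {
    explicit.add(word);
    if (isMixedCase(word) || isAllCaps(word)) {
      hidden.computeIfAbsent(titleCase(word), k -> new ArrayList<>()).add(word);
    }
  }

  /** A hidden entry alone never makes a word valid; an explicit entry always does. */
  boolean isValidWord(String word) {
    return explicit.contains(word);
  }

  /** Returns the stems for a word; all-caps input also consults its title-case variant. */
  List<String> stem(String word) {
    List<String> stems = new ArrayList<>();
    if (explicit.contains(word)) stems.add(word);
    if (isAllCaps(word)) {
      String title = titleCase(word);
      if (explicit.contains(title) && !stems.contains(title)) stems.add(title);
      // Hidden entries are consulted only on this case-variant path.
      for (String original : hidden.getOrDefault(title, List.of())) {
        if (!stems.contains(original)) stems.add(original);
      }
    }
    return stems;
  }

  public static void main(String[] args) {
    HiddenTitleCaseSketch dict = new HiddenTitleCaseSketch();
    dict.addEntry("OpenOffice.org");
    System.out.println(dict.stem("OPENOFFICE.ORG"));        // [OpenOffice.org]
    System.out.println(dict.isValidWord("Openoffice.org")); // false
  }
}
```

In this sketch, "OPENOFFICE.ORG" reaches "OpenOffice.org" through the hidden "Openoffice.org" entry, while "Openoffice.org" itself stays invalid until someone adds it explicitly with addEntry.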


> Hunspell: improve stemming of all-caps words
> --------------------------------------------
>
>                 Key: LUCENE-9676
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9676
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Peter Gromov
>            Assignee: Dawid Weiss
>            Priority: Major
>             Fix For: master (9.0)
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently, words like "OPENOFFICE.ORG" produce no stems even if the dictionary contains "OpenOffice.org".



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
