Posted to dev@lucene.apache.org by "Tomoko Uchida (JIRA)" <ji...@apache.org> on 2019/04/13 12:37:00 UTC
[jira] [Resolved] (LUCENE-8752) Apply a patch to kuromoji dictionary to properly handle Japanese new era '令和' (REIWA)
[ https://issues.apache.org/jira/browse/LUCENE-8752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tomoko Uchida resolved LUCENE-8752.
-----------------------------------
Resolution: Fixed
Fix Version/s: master (9.0), 8.1
> Apply a patch to kuromoji dictionary to properly handle Japanese new era '令和' (REIWA)
> -------------------------------------------------------------------------------------
>
> Key: LUCENE-8752
> URL: https://issues.apache.org/jira/browse/LUCENE-8752
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/analysis
> Reporter: Tomoko Uchida
> Assignee: Tomoko Uchida
> Priority: Minor
> Fix For: 8.1, master (9.0)
>
> Attachments: LUCENE-8752.patch
>
>
> On May 1st, 2019, the Japanese era name ('元号', Gengo) will change to '令和' (Reiwa). See this article for more details:
> [https://www.bbc.com/news/world-asia-47769566]
> Currently '令和' is split into '令' and '和' by {{JapaneseTokenizer}}. It should be tokenized as a single word so that Japanese text containing era names can be searched as users expect. Because the default Kuromoji dictionary (mecab-ipadic) has not been maintained since 2007, a one-line patch to the source CSV file is needed for this era change.
> Era names are used in many official or formal documents in Japan, so it would be desirable for search systems to handle this properly without requiring a user dictionary or phrase queries. :)
> FYI, the JDK Date-Time API will support the new era in upcoming updates.
> [https://blogs.oracle.com/java-platform-group/a-new-japanese-era-for-java]
> The patch is available here:
> [https://github.com/apache/lucene-solr/pull/632]
>
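For anyone who wants to check the JDK side mentioned in the description, here is a minimal sketch using java.time.chrono.JapaneseDate. The class name ReiwaEraCheck and its helper methods are made up for illustration and are not part of the patch. It assumes a JDK whose era data already covers the period starting 2019-05-01; on an older runtime without that data, the same date would still be reported as Heisei. It reads the numeric era value instead of the JapaneseEra.REIWA constant, since that constant only exists in newer JDK releases.

```java
import java.time.LocalDate;
import java.time.chrono.JapaneseDate;
import java.time.format.TextStyle;
import java.time.temporal.ChronoField;
import java.util.Locale;

// Hypothetical helper class, for illustration only.
public class ReiwaEraCheck {

    // Numeric era value of a Gregorian date in the Japanese calendar.
    // Heisei is 2; the era that starts on 2019-05-01 is 3.
    public static int eraValue(int year, int month, int day) {
        return JapaneseDate.from(LocalDate.of(year, month, day)).getEra().getValue();
    }

    // Year within the era, e.g. 31 for the last year of Heisei.
    public static int yearOfEra(int year, int month, int day) {
        return JapaneseDate.from(LocalDate.of(year, month, day)).get(ChronoField.YEAR_OF_ERA);
    }

    public static void main(String[] args) {
        JapaneseDate lastHeiseiDay = JapaneseDate.of(2019, 4, 30);
        JapaneseDate firstNewEraDay = JapaneseDate.of(2019, 5, 1);

        // The display name depends on the era data shipped with the running JDK.
        System.out.println(lastHeiseiDay.getEra().getDisplayName(TextStyle.FULL, Locale.JAPAN)
                + " " + yearOfEra(2019, 4, 30));
        System.out.println(firstNewEraDay.getEra().getDisplayName(TextStyle.FULL, Locale.JAPAN)
                + " " + yearOfEra(2019, 5, 1));
    }
}
```

This only covers date handling in the JDK. It does not affect how {{JapaneseTokenizer}} segments '令和', which still depends on the dictionary change in the patch.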
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)