Posted to issues@lucene.apache.org by "Uihyun Kim (Jira)" <ji...@apache.org> on 2022/02/09 15:38:00 UTC
[jira] [Created] (LUCENE-10416) Update Korean Dictionary for Nori
Uihyun Kim created LUCENE-10416:
-----------------------------------
Summary: Update Korean Dictionary for Nori
Key: LUCENE-10416
URL: https://issues.apache.org/jira/browse/LUCENE-10416
Project: Lucene - Core
Issue Type: Improvement
Components: modules/analysis
Reporter: Uihyun Kim
Nori, the Korean analyzer, uses a Korean dictionary named mecab-ko-dic, which is available under an Apache license here: [https://bitbucket.org/eunjeon/mecab-ko-dic]
The copy of the dictionary used in Nori has not been updated, although upstream has published updates that provide better analysis results. Downloads are available here: [https://bitbucket.org/eunjeon/mecab-ko-dic/downloads]
* Currently used in Nori: [https://bitbucket.org/eunjeon/mecab-ko-dic/downloads/mecab-ko-dic-2.0.3-20170922.tar.gz]
* Latest: [https://bitbucket.org/eunjeon/mecab-ko-dic/downloads/mecab-ko-dic-2.1.1-20180720.tar.gz]
The latest release differs from the currently used version in the following ways (change log: [https://bitbucket.org/eunjeon/mecab-ko-dic/src/master/CHANGES.md]):
* New feature: added semantic classes for NNG - 장소 (place), 행위 (action), 상태변화 (change of state), 정적상태 (static state)
* Fix: correct unexpectedly huge cost on NNG/장소
* New words
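To make the listed changes concrete, here is a minimal Python sketch of how two dictionary versions could be compared. It assumes the mecab-ko-dic CSV layout in which column 0 is the surface form, column 4 the POS tag, and column 5 the semantic class; the sample rows, IDs, and costs are made up for illustration and are not taken from either release, and the helper names are hypothetical.

```python
import csv
from collections import Counter
from io import StringIO

# Assumed column positions in a mecab-ko-dic CSV row.
SURFACE, POS_TAG, SEMANTIC_CLASS = 0, 4, 5

# Made-up sample rows for illustration only; IDs and costs are not real.
OLD_ROWS = """\
가게,1,1,2000,NNG,*,F,가게,*,*,*,*
달리기,1,1,2500,NNG,*,F,달리기,*,*,*,*
"""

NEW_ROWS = """\
가게,1,1,2000,NNG,장소,F,가게,*,*,*,*
달리기,1,1,2500,NNG,행위,F,달리기,*,*,*,*
놀이터,1,1,2300,NNG,장소,F,놀이터,*,*,*,*
"""


def load_entries(text):
    """Parse dictionary CSV text into a list of rows, skipping blank lines."""
    return [row for row in csv.reader(StringIO(text)) if row]


def nng_semantic_classes(entries):
    """Count NNG entries per semantic class ('*' means no class assigned)."""
    return Counter(
        row[SEMANTIC_CLASS] for row in entries if row[POS_TAG] == "NNG"
    )


def new_surfaces(old_entries, new_entries):
    """Return surface forms present in the new version but not in the old one."""
    old = {row[SURFACE] for row in old_entries}
    return sorted({row[SURFACE] for row in new_entries} - old)


old_entries = load_entries(OLD_ROWS)
new_entries = load_entries(NEW_ROWS)
old_classes = nng_semantic_classes(old_entries)
new_classes = nng_semantic_classes(new_entries)
added = new_surfaces(old_entries, new_entries)

print("NNG semantic classes (old):", dict(old_classes))
print("NNG semantic classes (new):", dict(new_classes))
print("New words:", added)
```

Against the real releases, the same functions could be fed the contents of the extracted CSV files instead of the inline samples.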
With the updated dictionary, running :lucene:analysis:nori:test and building a new binary both complete without issues.