Posted to issues@hivemall.apache.org by "Takuya Kitazawa (JIRA)" <ji...@apache.org> on 2017/07/07 03:33:01 UTC
[jira] [Created] (HIVEMALL-130) Support user-defined dictionary for `tokenize_ja`
Takuya Kitazawa created HIVEMALL-130:
----------------------------------------
Summary: Support user-defined dictionary for `tokenize_ja`
Key: HIVEMALL-130
URL: https://issues.apache.org/jira/browse/HIVEMALL-130
Project: Hivemall
Issue Type: Improvement
Reporter: Takuya Kitazawa
Assignee: Takuya Kitazawa
Add another argument, "userDict", to `tokenize_ja`. Its type would be List<String>, and each element defines a new word in the Kuromoji user dictionary format:
<word>,<result>,<read>,<class>
Example dictionary: https://github.com/atilika/kuromoji/blob/d0700ab6dd489aaf0fcb1e4e78ce2f682be9f255/kuromoji-core/src/test/resources/userdict.txt
Ref (Japanese): http://d.hatena.ne.jp/Kazuhira/20130616/1371390716
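To make the proposed entry format concrete, here is a minimal sketch of how a list of "userDict" strings could be validated before being handed to the tokenizer. It is not Hivemall or Kuromoji code: the names `UserDictEntry` and `parse_user_dict` are hypothetical, and the sketch assumes that <result> and <read> are space-separated with one reading per segment, and that lines starting with "#" are comments, as in the linked userdict.txt.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class UserDictEntry:
    """One parsed entry (hypothetical type, for illustration only)."""
    word: str          # <word>: surface form to match in the input text
    result: List[str]  # <result>: segments the tokenizer should emit
    read: List[str]    # <read>: reading for each segment
    word_class: str    # <class>: part-of-speech label


def parse_user_dict(lines: List[str]) -> List[UserDictEntry]:
    """Parse entries of the form <word>,<result>,<read>,<class>."""
    entries = []
    for line in lines:
        line = line.strip()
        # Assumption: blank lines and '#' comments are skipped.
        if not line or line.startswith("#"):
            continue
        fields = [field.strip() for field in line.split(",")]
        if len(fields) != 4:
            raise ValueError(f"expected 4 comma-separated fields: {line!r}")
        word, result, read, word_class = fields
        segments = result.split()
        readings = read.split()
        # Assumption: each segment has exactly one reading.
        if len(segments) != len(readings):
            raise ValueError(f"segment/reading count mismatch: {line!r}")
        entries.append(UserDictEntry(word, segments, readings, word_class))
    return entries


if __name__ == "__main__":
    # Illustrative entry; the values are an example, not taken from this issue.
    user_dict = ["日本経済新聞,日本 経済 新聞,ニホン ケイザイ シンブン,カスタム名詞"]
    for entry in parse_user_dict(user_dict):
        print(entry)
```

Rejecting malformed entries up front would let the function report a clear error for a bad "userDict" element instead of failing inside the tokenizer; whether `tokenize_ja` should validate this strictly is a design choice left open by this issue.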
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)