Posted to dev@atlas.apache.org by "Hayden Muhl (JIRA)" <ji...@apache.org> on 2015/07/10 01:52:04 UTC

[jira] [Created] (ATLAS-57) Allow Kuromoji to load a custom dictionary from the file system instead of the classpath

Hayden Muhl created ATLAS-57:
--------------------------------

             Summary: Allow Kuromoji to load a custom dictionary from the file system instead of the classpath
                 Key: ATLAS-57
                 URL: https://issues.apache.org/jira/browse/ATLAS-57
             Project: Atlas
          Issue Type: Improvement
            Reporter: Hayden Muhl


The JapaneseTokenizer currently loads its dictionary directly from the classpath. We build a custom dictionary with additional words, and the only way to get the JapaneseTokenizer to pick it up is to repackage the WAR file (we currently use 4.10) so that it includes our dictionary.

Proposed changes:

* Publish the Kuromoji tools as a JAR to Maven to make compiling a custom dictionary easier.
* Move loading of the dictionary out to the JapaneseTokenizerFactory, and pass the dictionary into the JapaneseTokenizer constructor.
* Add an argument to the JapaneseTokenizerFactory to load a dictionary off the file system. If this argument is not passed, then load the default dictionary from the classpath.
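
A rough sketch of the fallback described in the last item, using only the JDK. The class name DictionaryResolver, the argument name "dictionaryPath", and the resource name "default-dictionary.dat" are all hypothetical placeholders for illustration; none of them come from the existing Kuromoji code, and the real factory would need to hand the resulting stream to whatever dictionary loader the tokenizer uses.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;

/** Hypothetical helper; not part of the existing Kuromoji API. */
final class DictionaryResolver {
    /** Hypothetical name of the new factory argument. */
    static final String DICTIONARY_PATH_ARG = "dictionaryPath";

    /** Placeholder classpath location of the bundled dictionary (an assumption). */
    static final String DEFAULT_RESOURCE = "default-dictionary.dat";

    private DictionaryResolver() {}

    /** True when the factory arguments name a dictionary on the file system. */
    static boolean usesFileSystem(Map<String, String> args) {
        String path = args.get(DICTIONARY_PATH_ARG);
        return path != null && !path.isEmpty();
    }

    /**
     * Opens the dictionary from the file system if the argument is present,
     * otherwise falls back to the default dictionary on the classpath.
     */
    static InputStream open(Map<String, String> args) throws IOException {
        if (usesFileSystem(args)) {
            Path path = Paths.get(args.get(DICTIONARY_PATH_ARG));
            return Files.newInputStream(path);
        }
        InputStream in = DictionaryResolver.class.getClassLoader()
                .getResourceAsStream(DEFAULT_RESOURCE);
        if (in == null) {
            throw new IOException(
                    "Default dictionary not found on classpath: " + DEFAULT_RESOURCE);
        }
        return in;
    }
}
```

With something along these lines, the factory would resolve the stream once and pass the loaded dictionary into the JapaneseTokenizer constructor, so the tokenizer itself would no longer need to know where the dictionary came from.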



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)