Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2020/03/01 13:25:02 UTC

[GitHub] [lucene-solr] msokolov commented on a change in pull request #1296: LUCENE-9253: Support custom dictionaries in KoreanTokenizer

URL: https://github.com/apache/lucene-solr/pull/1296#discussion_r386107769
 
 

 ##########
 File path: lucene/analysis/nori/src/java/org/apache/lucene/analysis/ko/KoreanTokenizer.java
 ##########
 @@ -185,16 +185,43 @@ public KoreanTokenizer(AttributeFactory factory, UserDictionary userDictionary,
    * @param discardPunctuation true if punctuation tokens should be dropped from the output.
    */
   public KoreanTokenizer(AttributeFactory factory, UserDictionary userDictionary, DecompoundMode mode, boolean outputUnknownUnigrams, boolean discardPunctuation) {
+    this(factory,
+        TokenInfoDictionary.getInstance(),
+        UnknownDictionary.getInstance(),
+        ConnectionCosts.getInstance(),
+        userDictionary, mode, outputUnknownUnigrams, discardPunctuation);
+  }
+
+  /**
+   * <p>Create a new KoreanTokenizer supplying a custom system dictionary and unknown dictionary.
+   * This constructor provides an entry point for users that want to construct custom language models
+   * that can be used as input to {@link org.apache.lucene.analysis.ko.util.DictionaryBuilder}.</p>
+   *
+   * @param factory the AttributeFactory to use
+   * @param systemDictionary a custom known token dictionary
+   * @param unkDictionary a custom unknown token dictionary
+   * @param connectionCosts custom token transition costs
+   * @param userDictionary Optional: if non-null, user dictionary.
+   * @param mode Decompound mode.
+   * @param outputUnknownUnigrams If true outputs unigrams for unknown words.
 
 Review comment:
   Don't capitalize "If"
