Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2021/03/22 14:36:34 UTC

[GitHub] [lucene] mocobeta commented on a change in pull request #26: LUCENE-9853: Use CJKWidthCharFilter as the default character width normalizer in JapaneseAnalyzer

mocobeta commented on a change in pull request #26:
URL: https://github.com/apache/lucene/pull/26#discussion_r598772602



##########
File path: lucene/analysis/kuromoji/src/java/org/apache/lucene/analysis/ja/JapaneseAnalyzer.java
##########
@@ -39,21 +41,28 @@
   private final Mode mode;
   private final Set<String> stoptags;
   private final UserDictionary userDict;
+  private final boolean charNormalization;
 
   public JapaneseAnalyzer() {
     this(
         null,
         JapaneseTokenizer.DEFAULT_MODE,
         DefaultSetHolder.DEFAULT_STOP_SET,
-        DefaultSetHolder.DEFAULT_STOP_TAGS);
+        DefaultSetHolder.DEFAULT_STOP_TAGS,
+        true);
   }
 
   public JapaneseAnalyzer(
-      UserDictionary userDict, Mode mode, CharArraySet stopwords, Set<String> stoptags) {
+      UserDictionary userDict,
+      Mode mode,
+      CharArraySet stopwords,
+      Set<String> stoptags,
+      boolean charNormalization) {

Review comment:
       I thought it would be better to provide this option for backward compatibility. But yes, users can easily switch to their own custom analyzer if they want. I'll remove the parameter and add some documentation (a MIGRATE entry) on how to switch back to the old behaviour.
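
       For context on what the character width normalization under discussion does, here is a minimal standalone sketch. It uses only the JDK, not Lucene, and is not the CJKWidthCharFilter implementation. It folds fullwidth ASCII variants (U+FF01 to U+FF5E) into Basic Latin; folding halfwidth Katakana into regular kana is deliberately left out. The class and method names (WidthNormalizationSketch, foldFullwidthAscii) are hypothetical and exist only for this illustration.

```java
// Illustrative only: a simplified stand-in for CJK width normalization.
// It is NOT Lucene's CJKWidthCharFilter and covers only fullwidth ASCII.
final class WidthNormalizationSketch {

    // Fullwidth ASCII variants occupy U+FF01..U+FF5E and map one-to-one
    // onto U+0021..U+007E, so the fold is a constant offset.
    private static final int FULLWIDTH_START = 0xFF01;
    private static final int FULLWIDTH_END = 0xFF5E;
    private static final int OFFSET = 0xFEE0;

    private WidthNormalizationSketch() {}

    /** Returns the input with fullwidth ASCII variants folded to Basic Latin. */
    static String foldFullwidthAscii(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c >= FULLWIDTH_START && c <= FULLWIDTH_END) {
                out.append((char) (c - OFFSET));
            } else {
                // Everything else, including kanji and kana, passes through unchanged.
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "\uFF2C\uFF55\uFF43\uFF45\uFF4E\uFF45" is "Lucene" written in fullwidth letters.
        System.out.println(foldFullwidthAscii("\uFF2C\uFF55\uFF43\uFF45\uFF4E\uFF45"));
    }
}
```

       Applying this kind of fold before tokenization means the tokenizer sees the same characters whether the text was typed in fullwidth or halfwidth form. Whether the change is visible to existing users depends on their data, which is why a documented way back to the old behaviour is still useful.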



