Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2021/03/22 12:26:04 UTC

[GitHub] [lucene] rmuir commented on a change in pull request #26: LUCENE-9853: Use CJKWidthCharFilter as the default character width normalizer in JapaneseAnalyzer

rmuir commented on a change in pull request #26:
URL: https://github.com/apache/lucene/pull/26#discussion_r598268124



##########
File path: lucene/analysis/kuromoji/src/java/org/apache/lucene/analysis/ja/JapaneseAnalyzer.java
##########
@@ -39,21 +41,28 @@
   private final Mode mode;
   private final Set<String> stoptags;
   private final UserDictionary userDict;
+  private final boolean charNormalization;
 
   public JapaneseAnalyzer() {
     this(
         null,
         JapaneseTokenizer.DEFAULT_MODE,
         DefaultSetHolder.DEFAULT_STOP_SET,
-        DefaultSetHolder.DEFAULT_STOP_TAGS);
+        DefaultSetHolder.DEFAULT_STOP_TAGS,
+        true);
   }
 
   public JapaneseAnalyzer(
-      UserDictionary userDict, Mode mode, CharArraySet stopwords, Set<String> stoptags) {
+      UserDictionary userDict,
+      Mode mode,
+      CharArraySet stopwords,
+      Set<String> stoptags,
+      boolean charNormalization) {

Review comment:
   I think this is a bit confusing: if set to `false`, character normalization is still performed, just at a different place in the chain.
   
   Do we really need this parameter? I think it would be better to document the change well in CHANGES.txt. If users want different behavior, they can very easily build an Analyzer from the individual components.
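
   To illustrate why "still performed, just at a different place in the chain" matters, here is a minimal, self-contained sketch. It is not Lucene code: `WidthOrderDemo`, `foldWidth`, and the toy `tokenize` (which splits only on ASCII `-`) are hypothetical stand-ins for a width-normalizing char filter or token filter and a tokenizer, and `foldWidth` only folds the fullwidth ASCII range, not halfwidth Katakana.

```java
import java.util.ArrayList;
import java.util.List;

final class WidthOrderDemo {

  // Fold fullwidth ASCII variants (U+FF01..U+FF5E) to Basic Latin.
  // All other characters pass through unchanged.
  static String foldWidth(String s) {
    StringBuilder sb = new StringBuilder(s.length());
    for (int i = 0; i < s.length(); i++) {
      char c = s.charAt(i);
      sb.append(c >= '\uFF01' && c <= '\uFF5E' ? (char) (c - 0xFEE0) : c);
    }
    return sb.toString();
  }

  // Toy tokenizer (an assumption for illustration): splits only on ASCII '-'
  // and drops empty tokens.
  static List<String> tokenize(String s) {
    List<String> out = new ArrayList<>();
    for (String t : s.split("-")) {
      if (!t.isEmpty()) {
        out.add(t);
      }
    }
    return out;
  }

  // Normalization runs on the raw text, like a char filter.
  static List<String> normalizeBeforeTokenizing(String s) {
    return tokenize(foldWidth(s));
  }

  // Normalization runs on each token, like a token filter.
  static List<String> normalizeAfterTokenizing(String s) {
    List<String> out = new ArrayList<>();
    for (String t : tokenize(s)) {
      out.add(foldWidth(t));
    }
    return out;
  }

  public static void main(String[] args) {
    // Fullwidth 'a', fullwidth hyphen-minus, fullwidth 'b'.
    String input = "\uFF41\uFF0D\uFF42";
    System.out.println(normalizeBeforeTokenizing(input)); // prints [a, b]
    System.out.println(normalizeAfterTokenizing(input));  // prints [a-b]
  }
}
```

   Both paths normalize the characters, but only the first lets the tokenizer see the normalized text, so the resulting tokens differ. A boolean named `charNormalization` does not convey that distinction.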



