Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2022/02/09 12:04:36 UTC

[GitHub] [lucene] romseygeek opened a new pull request #665: LUCENE-10413: Make default Ukrainian stopword set available

romseygeek opened a new pull request #665:
URL: https://github.com/apache/lucene/pull/665


   This commit adds a new `getDefaultStopwords()` static method to 
   UkrainianMorfologikAnalyzer, which makes it possible to create an
   analyzer with the default stop word set but a custom stem exclusion
   set.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] romseygeek merged pull request #665: LUCENE-10413: Make default Ukrainian stopword set available

Posted by GitBox <gi...@apache.org>.
romseygeek merged pull request #665:
URL: https://github.com/apache/lucene/pull/665


   




[GitHub] [lucene] romseygeek commented on a change in pull request #665: LUCENE-10413: Make default Ukrainian stopword set available

Posted by GitBox <gi...@apache.org>.
romseygeek commented on a change in pull request #665:
URL: https://github.com/apache/lucene/pull/665#discussion_r802706506



##########
File path: lucene/analysis/morfologik/src/java/org/apache/lucene/analysis/uk/UkrainianMorfologikAnalyzer.java
##########
@@ -113,14 +113,11 @@ private static DefaultResources getDefaultResources() {
     return defaultResources;
   }
 
-  private static class DefaultResources {
-    final CharArraySet stopSet;
-    final Dictionary dictionary;
+  private record DefaultResources(CharArraySet stopSet, Dictionary dictionary) {}
 
-    private DefaultResources(CharArraySet stopSet, Dictionary dictionary) {
-      this.stopSet = stopSet;
-      this.dictionary = dictionary;
-    }
+  /** Returns the default stopword set for this analyzer */
+  public static CharArraySet getDefaultStopwords() {
+    return getDefaultResources().stopSet;

Review comment:
       oh good call, I've wrapped it in `CharArraySet.unmodifiableSet()`






[GitHub] [lucene] mayya-sharipova commented on a change in pull request #665: LUCENE-10413: Make default Ukrainian stopword set available

Posted by GitBox <gi...@apache.org>.
mayya-sharipova commented on a change in pull request #665:
URL: https://github.com/apache/lucene/pull/665#discussion_r802636626



##########
File path: lucene/analysis/morfologik/src/java/org/apache/lucene/analysis/uk/UkrainianMorfologikAnalyzer.java
##########
@@ -113,14 +113,11 @@ private static DefaultResources getDefaultResources() {
     return defaultResources;
   }
 
-  private static class DefaultResources {
-    final CharArraySet stopSet;
-    final Dictionary dictionary;
+  private record DefaultResources(CharArraySet stopSet, Dictionary dictionary) {}

Review comment:
       nice, records!
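
For readers unfamiliar with the change in this hunk, here is a minimal, self-contained sketch of what replacing the hand-written holder class with a record buys. It is not the Lucene code: `RecordSketch`, `ResourcesClass`, and `ResourcesRecord` are hypothetical names, and `Set<String>` and `String` stand in for `CharArraySet` and `Dictionary`. It assumes a Java version with records (16 or later).

```java
import java.util.Set;

final class RecordSketch {
  // Before: a holder class with final fields and an explicit constructor.
  static final class ResourcesClass {
    final Set<String> stopSet;
    final String dictionary;

    ResourcesClass(Set<String> stopSet, String dictionary) {
      this.stopSet = stopSet;
      this.dictionary = dictionary;
    }
  }

  // After: a record declares the same two final fields, a canonical
  // constructor, accessors, equals, hashCode, and toString in one line.
  record ResourcesRecord(Set<String> stopSet, String dictionary) {}

  public static void main(String[] args) {
    // Placeholder values, not real stopwords or a real dictionary.
    ResourcesRecord r = new ResourcesRecord(Set.of("a", "the"), "dict");

    // Generated accessor method.
    System.out.println(r.stopSet().size()); // prints 2

    // The record's private field is also reachable from the enclosing
    // top-level class, which is why `.stopSet` in the diff still compiles.
    System.out.println(r.stopSet == r.stopSet()); // prints true

    // Records compare by component values.
    System.out.println(r.equals(new ResourcesRecord(Set.of("a", "the"), "dict"))); // prints true
  }
}
```

Note that a record only makes the references final; the set it holds can still be mutable, which is what the other review comments on this pull request are about.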






[GitHub] [lucene] dweiss commented on a change in pull request #665: LUCENE-10413: Make default Ukrainian stopword set available

Posted by GitBox <gi...@apache.org>.
dweiss commented on a change in pull request #665:
URL: https://github.com/apache/lucene/pull/665#discussion_r802644902



##########
File path: lucene/analysis/morfologik/src/java/org/apache/lucene/analysis/uk/UkrainianMorfologikAnalyzer.java
##########
@@ -113,14 +113,11 @@ private static DefaultResources getDefaultResources() {
     return defaultResources;
   }
 
-  private static class DefaultResources {
-    final CharArraySet stopSet;
-    final Dictionary dictionary;
+  private record DefaultResources(CharArraySet stopSet, Dictionary dictionary) {}
 
-    private DefaultResources(CharArraySet stopSet, Dictionary dictionary) {
-      this.stopSet = stopSet;
-      this.dictionary = dictionary;
-    }
+  /** Returns the default stopword set for this analyzer */
+  public static CharArraySet getDefaultStopwords() {
+    return getDefaultResources().stopSet;

Review comment:
       I'd return a copy of the default stop set rather than a reference to it, so that any accidental changes to the returned set have no effect on the defaults?
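
The two ways of protecting the shared defaults that come up in this thread (returning a copy, or wrapping the set so it cannot be modified) can be sketched with plain `java.util` collections. This is an illustration under stated assumptions, not the Lucene implementation: `DefaultSetExposure` and its methods are hypothetical, `Set<String>` stands in for `CharArraySet`, and `Collections.unmodifiableSet` stands in for `CharArraySet.unmodifiableSet()`.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

final class DefaultSetExposure {
  // Stand-in for the analyzer's shared default stop set (placeholder contents).
  private static final Set<String> DEFAULTS = new HashSet<>(Set.of("a", "the", "of"));

  /** Option 1: return a fresh copy; edits by the caller never reach DEFAULTS. */
  static Set<String> defaultsAsCopy() {
    return new HashSet<>(DEFAULTS);
  }

  /** Option 2: return a read-only view; attempted edits throw instead. */
  static Set<String> defaultsAsUnmodifiableView() {
    return Collections.unmodifiableSet(DEFAULTS);
  }

  public static void main(String[] args) {
    Set<String> copy = defaultsAsCopy();
    copy.add("custom"); // succeeds, but only changes the copy
    System.out.println(DEFAULTS.contains("custom")); // prints false

    try {
      defaultsAsUnmodifiableView().add("custom");
    } catch (UnsupportedOperationException e) {
      System.out.println("modification rejected"); // prints modification rejected
    }
  }
}
```

The trade-off is roughly this: a copy silently absorbs accidental edits at the cost of an allocation per call, while an unmodifiable view allocates only a thin wrapper and makes the mistake visible as an exception.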



