Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2022/02/09 12:04:36 UTC
[GitHub] [lucene] romseygeek opened a new pull request #665: LUCENE-10413: Make default Ukrainian stopword set available
romseygeek opened a new pull request #665:
URL: https://github.com/apache/lucene/pull/665
This commit adds a new `getDefaultStopwords()` static method to
UkrainianMorfologikAnalyzer, which makes it possible to create an
analyzer with the default stop word set but a custom stem exclusion
set.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene] romseygeek merged pull request #665: LUCENE-10413: Make default Ukrainian stopword set available
Posted by GitBox <gi...@apache.org>.
romseygeek merged pull request #665:
URL: https://github.com/apache/lucene/pull/665
[GitHub] [lucene] romseygeek commented on a change in pull request #665: LUCENE-10413: Make default Ukrainian stopword set available
Posted by GitBox <gi...@apache.org>.
romseygeek commented on a change in pull request #665:
URL: https://github.com/apache/lucene/pull/665#discussion_r802706506
##########
File path: lucene/analysis/morfologik/src/java/org/apache/lucene/analysis/uk/UkrainianMorfologikAnalyzer.java
##########
@@ -113,14 +113,11 @@ private static DefaultResources getDefaultResources() {
return defaultResources;
}
- private static class DefaultResources {
- final CharArraySet stopSet;
- final Dictionary dictionary;
+ private record DefaultResources(CharArraySet stopSet, Dictionary dictionary) {}
- private DefaultResources(CharArraySet stopSet, Dictionary dictionary) {
- this.stopSet = stopSet;
- this.dictionary = dictionary;
- }
+ /** Returns the default stopword set for this analyzer */
+ public static CharArraySet getDefaultStopwords() {
+ return getDefaultResources().stopSet;
Review comment:
Oh, good call. I've wrapped it in `CharArraySet.unmodifiableSet()`.
[GitHub] [lucene] mayya-sharipova commented on a change in pull request #665: LUCENE-10413: Make default Ukrainian stopword set available
Posted by GitBox <gi...@apache.org>.
mayya-sharipova commented on a change in pull request #665:
URL: https://github.com/apache/lucene/pull/665#discussion_r802636626
##########
File path: lucene/analysis/morfologik/src/java/org/apache/lucene/analysis/uk/UkrainianMorfologikAnalyzer.java
##########
@@ -113,14 +113,11 @@ private static DefaultResources getDefaultResources() {
return defaultResources;
}
- private static class DefaultResources {
- final CharArraySet stopSet;
- final Dictionary dictionary;
+ private record DefaultResources(CharArraySet stopSet, Dictionary dictionary) {}
Review comment:
nice, records!
[GitHub] [lucene] dweiss commented on a change in pull request #665: LUCENE-10413: Make default Ukrainian stopword set available
Posted by GitBox <gi...@apache.org>.
dweiss commented on a change in pull request #665:
URL: https://github.com/apache/lucene/pull/665#discussion_r802644902
##########
File path: lucene/analysis/morfologik/src/java/org/apache/lucene/analysis/uk/UkrainianMorfologikAnalyzer.java
##########
@@ -113,14 +113,11 @@ private static DefaultResources getDefaultResources() {
return defaultResources;
}
- private static class DefaultResources {
- final CharArraySet stopSet;
- final Dictionary dictionary;
+ private record DefaultResources(CharArraySet stopSet, Dictionary dictionary) {}
- private DefaultResources(CharArraySet stopSet, Dictionary dictionary) {
- this.stopSet = stopSet;
- this.dictionary = dictionary;
- }
+ /** Returns the default stopword set for this analyzer */
+ public static CharArraySet getDefaultStopwords() {
+ return getDefaultResources().stopSet;
Review comment:
I'd return a copy of the default stop set instead of a reference to it, so that any accidental changes to the returned set have no effect on the defaults?
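The two ways of protecting the shared defaults discussed in this thread (returning a copy, or wrapping the set so it cannot be modified) can be sketched with plain JDK collections. This is only an illustration under stated assumptions: the class `DefaultStopwordsSketch`, its two getters, and the placeholder stop words are hypothetical, and `java.util.Set` stands in for Lucene's `CharArraySet`; it is not the code that was merged.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for an analyzer that holds a shared default stop set.
class DefaultStopwordsSketch {

    // Placeholder entries; the real defaults are loaded from a resource file.
    private static final Set<String> DEFAULT_STOPWORDS =
            new HashSet<>(Set.of("the", "and", "of"));

    /** Option A: hand out a fresh copy, so caller changes never reach the defaults. */
    static Set<String> getDefaultStopwordsCopy() {
        return new HashSet<>(DEFAULT_STOPWORDS);
    }

    /** Option B: hand out an unmodifiable view, so caller changes fail fast. */
    static Set<String> getDefaultStopwordsView() {
        return Collections.unmodifiableSet(DEFAULT_STOPWORDS);
    }

    public static void main(String[] args) {
        // Modifying the copy leaves the shared defaults untouched.
        Set<String> copy = getDefaultStopwordsCopy();
        copy.add("custom");
        System.out.println("defaults changed by copy: "
                + getDefaultStopwordsCopy().contains("custom"));

        // Modifying the view is rejected with an exception.
        try {
            getDefaultStopwordsView().add("custom");
        } catch (UnsupportedOperationException e) {
            System.out.println("view rejected modification");
        }
    }
}
```

The copy costs an allocation on every call but lets callers extend the returned set freely. The unmodifiable view allocates only a thin wrapper, but callers who want a customized stop set have to copy it themselves.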