Posted to dev@lucene.apache.org by "Trey Jones (JIRA)" <ji...@apache.org> on 2018/07/20 14:33:00 UTC
[jira] [Created] (LUCENE-8416) Add tokenized version of o.o. to Stempel stopwords
Trey Jones created LUCENE-8416:
----------------------------------
Summary: Add tokenized version of o.o. to Stempel stopwords
Key: LUCENE-8416
URL: https://issues.apache.org/jira/browse/LUCENE-8416
Project: Lucene - Core
Issue Type: Improvement
Components: modules/analysis
Reporter: Trey Jones
The Stempel stopword list (lucene-solr/lucene/analysis/stempel/src/resources/org/apache/lucene/analysis/pl/stopwords.txt) contains "o.o.", which is a good stopword: it is part of "[sp. z o.o.|https://en.wiktionary.org/wiki/sp._z_o.o.]", the Polish abbreviation for "limited liability company". However, the standard tokenizer changes "o.o." to "o.o", so the stopword filter never matches it and has no effect.
Add "o.o" to the stopword list. (It's probably okay to leave "o.o." in the list, though, in case a different tokenizer is used.)
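
To illustrate the mismatch, here is a minimal Python sketch. It does not use Lucene: simple_tokenize is a hypothetical, crude stand-in that only mimics the one behavior described above (trailing periods are dropped, interior periods are kept), and the sample text and one-entry stopword set are made up for the example.

```python
import string


def simple_tokenize(text):
    """Crude stand-in for the tokenizer (an assumption, not Lucene code):
    split on whitespace, strip leading/trailing punctuation, lowercase."""
    tokens = []
    for raw in text.split():
        token = raw.strip(string.punctuation).lower()
        if token:
            tokens.append(token)
    return tokens


def remove_stopwords(tokens, stopwords):
    """Drop every token that appears verbatim in the stopword set."""
    return [t for t in tokens if t not in stopwords]


# Hypothetical sample text; the trailing period of "o.o." is lost in tokenization.
tokens = simple_tokenize("Firma sp. z o.o.")

# Illustrative one-entry set standing in for the current list.
current_stopwords = {"o.o."}
# The proposed change: keep "o.o." and add the tokenized form "o.o".
patched_stopwords = current_stopwords | {"o.o"}

before = remove_stopwords(tokens, current_stopwords)
after = remove_stopwords(tokens, patched_stopwords)

print(tokens)
print(before)
print(after)
```

With only "o.o." in the set, the token "o.o" passes through the filter untouched; once "o.o" is added, it is removed. Keeping both entries covers tokenizers that preserve the trailing period as well as those that drop it.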