Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2020/05/01 10:44:22 UTC
[GitHub] [lucene-solr] filcius opened a new pull request #1474: LUCENE-9354: Sync French stop words with latest version from Snowball.
filcius opened a new pull request #1474:
URL: https://github.com/apache/lucene-solr/pull/1474
# Description
Sync French stop words with latest version from Snowball.
This new version removed some French homonyms from the list.
# Tests
None. I am a native French speaker and reviewed the stop words myself.
# Checklist
- [x] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms to the standards described there to the best of my ability.
- [x] I have created a Jira issue and added the issue ID to my pull request title.
- [x] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended)
- [x] I have developed this patch against the `master` branch.
- [ ] I have run `ant precommit` and the appropriate test suite.
- [ ] I have added tests for my changes.
- [x] I have added documentation for the [Ref Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) (for Solr changes only).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene-solr] filcius commented on pull request #1474: LUCENE-9354: Sync French stop words with latest version from Snowball.
Posted by GitBox <gi...@apache.org>.
filcius commented on pull request #1474:
URL: https://github.com/apache/lucene-solr/pull/1474#issuecomment-622358775
Ok, I also updated snowball-website to take the latest master commit.
[GitHub] [lucene-solr] rmuir commented on pull request #1474: LUCENE-9354: Sync French stop words with latest version from Snowball.
Posted by GitBox <gi...@apache.org>.
rmuir commented on pull request #1474:
URL: https://github.com/apache/lucene-solr/pull/1474#issuecomment-622354621
We shouldn't manually modify these files, but instead update snowball to bring in these stopword changes.
We have to bump these commit hashes: https://github.com/apache/lucene-solr/blob/master/gradle/generation/snowball.gradle#L31-L36
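Before bumping, it can help to see which lines of the Gradle file actually carry pinned commits. The following is a minimal sketch, not part of the build: `list_pinned_commits` is a hypothetical helper, and it assumes the pins are written as full 40-character hexadecimal hashes, like the one in the snowball-website download URL.

```shell
# Hypothetical helper: list lines (with line numbers) that contain a
# full 40-character hexadecimal commit hash.
# Assumption: the pins in the given file are written as full hashes.
list_pinned_commits() {
  grep -nE '[0-9a-f]{40}' "$1"
}

# Possible usage from the repository root:
#   list_pinned_commits gradle/generation/snowball.gradle
# After editing the hashes, regenerate and review the result:
#   ./gradlew snowball
#   git diff
```

The helper only locates candidate lines; which hash belongs to which Snowball repository still has to be read from the file itself.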
[GitHub] [lucene-solr] rmuir commented on pull request #1474: LUCENE-9354: Sync French stop words with latest version from Snowball.
Posted by GitBox <gi...@apache.org>.
rmuir commented on pull request #1474:
URL: https://github.com/apache/lucene-solr/pull/1474#issuecomment-622362638
If you can give me permission, I can push these changes to your branch, and then we can merge the PR.
[GitHub] [lucene-solr] rmuir commented on pull request #1474: LUCENE-9354: Sync French stop words with latest version from Snowball.
Posted by GitBox <gi...@apache.org>.
rmuir commented on pull request #1474:
URL: https://github.com/apache/lucene-solr/pull/1474#issuecomment-622361321
Yeah, we need to run `./gradlew snowball` after bumping the commit to regenerate these files; editing them manually causes problems. In this case, there is a whitespace difference between the commit here and the actual generated file:
```
think:lucene-solr[newFrenchStopWords]$ ./gradlew snowball
To honour the JVM settings for this build a new JVM will be forked. Please consider using the daemon: https://docs.gradle.org/6.0.1/userguide/gradle_daemon.html.
Daemon will be stopped at the end of the build stopping after processing
> Task :buildSrc:compileJava
> Task :buildSrc:compileGroovy NO-SOURCE
> Task :buildSrc:processResources NO-SOURCE
> Task :buildSrc:classes
> Task :buildSrc:jar
> Task :buildSrc:assemble
> Task :buildSrc:compileTestJava NO-SOURCE
> Task :buildSrc:compileTestGroovy NO-SOURCE
> Task :buildSrc:processTestResources NO-SOURCE
> Task :buildSrc:testClasses UP-TO-DATE
> Task :buildSrc:test NO-SOURCE
> Task :buildSrc:check UP-TO-DATE
> Task :buildSrc:build
> Task :lucene:analysis:common:downloadSnowballData
> Task :lucene:analysis:common:downloadSnowballStemmers
> Task :lucene:analysis:common:downloadSnowballWebsite
Download https://github.com/snowballstem/snowball-website/archive/5a8cf2451d108217585d8e32d744f8b8fd20c711.zip
> Task :lucene:analysis:common:snowballGen
cc -O2 -W -Wall -Wmissing-prototypes -Wmissing-declarations -Iinclude -c -o compiler/generator_java.o compiler/generator_java.c
cc -O2 -W -Wall -Wmissing-prototypes -Wmissing-declarations -o snowball compiler/space.o compiler/tokeniser.o compiler/analyser.o compiler/generator.o compiler/driver.o compiler/generator_csharp.o compiler/generator_java.o compiler/generator_js.o compiler/generator_pascal.o compiler/generator_python.o compiler/generator_rust.o compiler/generator_go.o
./snowball algorithms/arabic.sbl -j -o java/org/tartarus/snowball/ext/arabicStemmer -p org.tartarus.snowball.SnowballStemmer
./snowball algorithms/armenian.sbl -j -o java/org/tartarus/snowball/ext/armenianStemmer -p org.tartarus.snowball.SnowballStemmer
./snowball algorithms/basque.sbl -j -o java/org/tartarus/snowball/ext/basqueStemmer -p org.tartarus.snowball.SnowballStemmer
./snowball algorithms/catalan.sbl -j -o java/org/tartarus/snowball/ext/catalanStemmer -p org.tartarus.snowball.SnowballStemmer
./snowball algorithms/danish.sbl -j -o java/org/tartarus/snowball/ext/danishStemmer -p org.tartarus.snowball.SnowballStemmer
./snowball algorithms/dutch.sbl -j -o java/org/tartarus/snowball/ext/dutchStemmer -p org.tartarus.snowball.SnowballStemmer
./snowball algorithms/english.sbl -j -o java/org/tartarus/snowball/ext/englishStemmer -p org.tartarus.snowball.SnowballStemmer
./snowball algorithms/estonian.sbl -j -o java/org/tartarus/snowball/ext/estonianStemmer -p org.tartarus.snowball.SnowballStemmer
./snowball algorithms/finnish.sbl -j -o java/org/tartarus/snowball/ext/finnishStemmer -p org.tartarus.snowball.SnowballStemmer
./snowball algorithms/french.sbl -j -o java/org/tartarus/snowball/ext/frenchStemmer -p org.tartarus.snowball.SnowballStemmer
./snowball algorithms/german.sbl -j -o java/org/tartarus/snowball/ext/germanStemmer -p org.tartarus.snowball.SnowballStemmer
./snowball algorithms/german2.sbl -j -o java/org/tartarus/snowball/ext/german2Stemmer -p org.tartarus.snowball.SnowballStemmer
./snowball algorithms/greek.sbl -j -o java/org/tartarus/snowball/ext/greekStemmer -p org.tartarus.snowball.SnowballStemmer
./snowball algorithms/hindi.sbl -j -o java/org/tartarus/snowball/ext/hindiStemmer -p org.tartarus.snowball.SnowballStemmer
./snowball algorithms/hungarian.sbl -j -o java/org/tartarus/snowball/ext/hungarianStemmer -p org.tartarus.snowball.SnowballStemmer
./snowball algorithms/indonesian.sbl -j -o java/org/tartarus/snowball/ext/indonesianStemmer -p org.tartarus.snowball.SnowballStemmer
./snowball algorithms/irish.sbl -j -o java/org/tartarus/snowball/ext/irishStemmer -p org.tartarus.snowball.SnowballStemmer
./snowball algorithms/italian.sbl -j -o java/org/tartarus/snowball/ext/italianStemmer -p org.tartarus.snowball.SnowballStemmer
./snowball algorithms/kraaij_pohlmann.sbl -j -o java/org/tartarus/snowball/ext/kraaij_pohlmannStemmer -p org.tartarus.snowball.SnowballStemmer
./snowball algorithms/lithuanian.sbl -j -o java/org/tartarus/snowball/ext/lithuanianStemmer -p org.tartarus.snowball.SnowballStemmer
./snowball algorithms/lovins.sbl -j -o java/org/tartarus/snowball/ext/lovinsStemmer -p org.tartarus.snowball.SnowballStemmer
./snowball algorithms/nepali.sbl -j -o java/org/tartarus/snowball/ext/nepaliStemmer -p org.tartarus.snowball.SnowballStemmer
./snowball algorithms/norwegian.sbl -j -o java/org/tartarus/snowball/ext/norwegianStemmer -p org.tartarus.snowball.SnowballStemmer
./snowball algorithms/porter.sbl -j -o java/org/tartarus/snowball/ext/porterStemmer -p org.tartarus.snowball.SnowballStemmer
./snowball algorithms/portuguese.sbl -j -o java/org/tartarus/snowball/ext/portugueseStemmer -p org.tartarus.snowball.SnowballStemmer
./snowball algorithms/romanian.sbl -j -o java/org/tartarus/snowball/ext/romanianStemmer -p org.tartarus.snowball.SnowballStemmer
./snowball algorithms/russian.sbl -j -o java/org/tartarus/snowball/ext/russianStemmer -p org.tartarus.snowball.SnowballStemmer
./snowball algorithms/serbian.sbl -j -o java/org/tartarus/snowball/ext/serbianStemmer -p org.tartarus.snowball.SnowballStemmer
./snowball algorithms/spanish.sbl -j -o java/org/tartarus/snowball/ext/spanishStemmer -p org.tartarus.snowball.SnowballStemmer
./snowball algorithms/swedish.sbl -j -o java/org/tartarus/snowball/ext/swedishStemmer -p org.tartarus.snowball.SnowballStemmer
./snowball algorithms/tamil.sbl -j -o java/org/tartarus/snowball/ext/tamilStemmer -p org.tartarus.snowball.SnowballStemmer
./snowball algorithms/turkish.sbl -j -o java/org/tartarus/snowball/ext/turkishStemmer -p org.tartarus.snowball.SnowballStemmer
destname=libstemmer_java; \
dest=dist/${destname}; \
rm -rf ${dest} && \
rm -f ${dest}.tgz && \
mkdir -p ${dest} && \
cp -a doc/libstemmer_java_README ${dest}/README && \
mkdir -p ${dest}/java/org/tartarus/snowball/ext && \
cp -a java/org/tartarus/snowball/ext/arabicStemmer.java java/org/tartarus/snowball/ext/armenianStemmer.java java/org/tartarus/snowball/ext/basqueStemmer.java java/org/tartarus/snowball/ext/catalanStemmer.java java/org/tartarus/snowball/ext/danishStemmer.java java/org/tartarus/snowball/ext/dutchStemmer.java java/org/tartarus/snowball/ext/englishStemmer.java java/org/tartarus/snowball/ext/estonianStemmer.java java/org/tartarus/snowball/ext/finnishStemmer.java java/org/tartarus/snowball/ext/frenchStemmer.java java/org/tartarus/snowball/ext/germanStemmer.java java/org/tartarus/snowball/ext/german2Stemmer.java java/org/tartarus/snowball/ext/greekStemmer.java java/org/tartarus/snowball/ext/hindiStemmer.java java/org/tartarus/snowball/ext/hungarianStemmer.java java/org/tartarus/snowball/ext/indonesianStemmer.java java/org/tartarus/snowball/ext/irishStemmer.java java/org/tartarus/snowball/ext/italianStemmer.java java/org/tartarus/snowball/ext/kraaij_pohlmannStemmer.java java/org/tartarus/snowball/ext/lithuanianStemmer.java java/org/tartarus/snowball/ext/lovinsStemmer.java java/org/tartarus/snowball/ext/nepaliStemmer.java java/org/tartarus/snowball/ext/norwegianStemmer.java java/org/tartarus/snowball/ext/porterStemmer.java java/org/tartarus/snowball/ext/portugueseStemmer.java java/org/tartarus/snowball/ext/romanianStemmer.java java/org/tartarus/snowball/ext/russianStemmer.java java/org/tartarus/snowball/ext/serbianStemmer.java java/org/tartarus/snowball/ext/spanishStemmer.java java/org/tartarus/snowball/ext/swedishStemmer.java java/org/tartarus/snowball/ext/tamilStemmer.java java/org/tartarus/snowball/ext/turkishStemmer.java ${dest}/java/org/tartarus/snowball/ext && \
mkdir -p ${dest}/java/org/tartarus/snowball && \
cp -a java/org/tartarus/snowball/Among.java java/org/tartarus/snowball/SnowballProgram.java java/org/tartarus/snowball/SnowballStemmer.java java/org/tartarus/snowball/TestApp.java ${dest}/java/org/tartarus/snowball && \
cp -a COPYING NEWS ${dest} && \
(cd ${dest} && \
echo "README" >> MANIFEST && \
ls java/org/tartarus/snowball/ext/*.java >> MANIFEST && \
ls java/org/tartarus/snowball/*.java >> MANIFEST) && \
(cd dist && tar zcf ${destname}.tgz ${destname}) && \
rm -rf ${dest}
> Task :snowball
BUILD SUCCESSFUL in 18s
4 actionable tasks: 4 executed
think:lucene-solr[newFrenchStopWords]$ git status
On branch newFrenchStopWords
Your branch is up to date with 'camellia/newFrenchStopWords'.
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: lucene/analysis/common/src/resources/org/apache/lucene/analysis/snowball/french_stop.txt
no changes added to commit (use "git add" and/or "git commit -a")
think:lucene-solr[newFrenchStopWords]$ git diff
diff --git a/lucene/analysis/common/src/resources/org/apache/lucene/analysis/snowball/french_stop.txt b/lucene/analysis/common/src/resources/org/apache/lucene/analysis/snowball/french_stop.txt
index 82404502b43..658ae9c91ac 100644
--- a/lucene/analysis/common/src/resources/org/apache/lucene/analysis/snowball/french_stop.txt
+++ b/lucene/analysis/common/src/resources/org/apache/lucene/analysis/snowball/french_stop.txt
@@ -183,3 +183,4 @@ quelle | which
quelles | which
sans | without
soi | oneself
+
```
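A quick way to confirm that a hand-edited file and a regenerated one differ only in this kind of whitespace is sketched below. Both function names are hypothetical, and the normalization is an assumption about what counts as insignificant: it strips trailing whitespace and drops blank lines, but leaves whitespace inside a line untouched.

```shell
# Hypothetical helper: print a file with trailing whitespace stripped
# and blank lines dropped.
normalize_whitespace() {
  sed -e 's/[[:space:]]*$//' -e '/^$/d' "$1"
}

# Succeeds (exit status 0) when the two files are identical after
# normalization, i.e. they differ at most in trailing whitespace
# and blank lines.
same_ignoring_whitespace() {
  [ "$(normalize_whitespace "$1")" = "$(normalize_whitespace "$2")" ]
}

# Possible usage, with placeholder file names:
#   same_ignoring_whitespace manual_french_stop.txt generated_french_stop.txt \
#     && echo "whitespace-only difference"
```

Even when the check succeeds, committing the regenerated file rather than the hand-edited one keeps the repository consistent with what `./gradlew snowball` produces.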
[GitHub] [lucene-solr] filcius commented on pull request #1474: LUCENE-9354: Sync French stop words with latest version from Snowball.
Posted by GitBox <gi...@apache.org>.
filcius commented on pull request #1474:
URL: https://github.com/apache/lucene-solr/pull/1474#issuecomment-622359248
I did not run Gradle or compile anything; I don't have that kind of setup here. If that is required, I propose that someone else commit to my branch and continue my work.
[GitHub] [lucene-solr] filcius commented on pull request #1474: LUCENE-9354: Sync French stop words with latest version from Snowball.
Posted by GitBox <gi...@apache.org>.
filcius commented on pull request #1474:
URL: https://github.com/apache/lucene-solr/pull/1474#issuecomment-622397809
Looks good, thanks @rmuir
[GitHub] [lucene-solr] filcius commented on pull request #1474: LUCENE-9354: Sync French stop words with latest version from Snowball.
Posted by GitBox <gi...@apache.org>.
filcius commented on pull request #1474:
URL: https://github.com/apache/lucene-solr/pull/1474#issuecomment-622360810
I don't see any other changes in snowball-website, so this may be fine: https://github.com/snowballstem/snowball-website/compare/ff891e74f08e7315523ee3c0cad55bb1b7831b9d...master