Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2020/05/01 10:44:22 UTC

[GitHub] [lucene-solr] filcius opened a new pull request #1474: LUCENE-9354: Sync French stop words with latest version from Snowball.

filcius opened a new pull request #1474:
URL: https://github.com/apache/lucene-solr/pull/1474


   # Description
   
   Sync French stop words with latest version from Snowball.
   
   This new version removed some French homonyms from the list
   
   # Tests
   
   None. I am a native French speaker and reviewed the stop words myself.
   
   # Checklist
   
   
   - [x] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms to the standards described there to the best of my ability.
   - [x] I have created a Jira issue and added the issue ID to my pull request title.
   - [x] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended)
   - [x] I have developed this patch against the `master` branch.
   - [ ] I have run `ant precommit` and the appropriate test suite.
   - [ ] I have added tests for my changes.
   - [x] I have added documentation for the [Ref Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) (for Solr changes only).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] filcius commented on pull request #1474: LUCENE-9354: Sync French stop words with latest version from Snowball.

Posted by GitBox <gi...@apache.org>.
filcius commented on pull request #1474:
URL: https://github.com/apache/lucene-solr/pull/1474#issuecomment-622358775


   Ok, I also updated snowball-website to take the latest master commit.




[GitHub] [lucene-solr] rmuir commented on pull request #1474: LUCENE-9354: Sync French stop words with latest version from Snowball.

Posted by GitBox <gi...@apache.org>.
rmuir commented on pull request #1474:
URL: https://github.com/apache/lucene-solr/pull/1474#issuecomment-622354621


   We shouldn't manually modify these files, but instead update snowball to bring in these stopword changes.
   
   We have to bump these commit hashes: https://github.com/apache/lucene-solr/blob/master/gradle/generation/snowball.gradle#L31-L36




[GitHub] [lucene-solr] rmuir commented on pull request #1474: LUCENE-9354: Sync French stop words with latest version from Snowball.

Posted by GitBox <gi...@apache.org>.
rmuir commented on pull request #1474:
URL: https://github.com/apache/lucene-solr/pull/1474#issuecomment-622362638


   If you can give me permissions, I can push the stuff to your branch and then we can merge the PR?




[GitHub] [lucene-solr] rmuir commented on pull request #1474: LUCENE-9354: Sync French stop words with latest version from Snowball.

Posted by GitBox <gi...@apache.org>.
rmuir commented on pull request #1474:
URL: https://github.com/apache/lucene-solr/pull/1474#issuecomment-622361321


   Yeah, we need to run `./gradlew snowball` after bumping the commit to regenerate such files. Otherwise, editing them manually causes problems: in this case, a whitespace difference between the commit here and the actual file.
   
   ```
   think:lucene-solr[newFrenchStopWords]$ ./gradlew snowball
   To honour the JVM settings for this build a new JVM will be forked. Please consider using the daemon: https://docs.gradle.org/6.0.1/userguide/gradle_daemon.html.
   Daemon will be stopped at the end of the build stopping after processing
   > Task :buildSrc:compileJava
   > Task :buildSrc:compileGroovy NO-SOURCE
   > Task :buildSrc:processResources NO-SOURCE
   > Task :buildSrc:classes
   > Task :buildSrc:jar
   > Task :buildSrc:assemble
   > Task :buildSrc:compileTestJava NO-SOURCE
   > Task :buildSrc:compileTestGroovy NO-SOURCE
   > Task :buildSrc:processTestResources NO-SOURCE
   > Task :buildSrc:testClasses UP-TO-DATE
   > Task :buildSrc:test NO-SOURCE
   > Task :buildSrc:check UP-TO-DATE
   > Task :buildSrc:build
   > Task :lucene:analysis:common:downloadSnowballData
   
   > Task :lucene:analysis:common:downloadSnowballStemmers
   
   > Task :lucene:analysis:common:downloadSnowballWebsite
   Download https://github.com/snowballstem/snowball-website/archive/5a8cf2451d108217585d8e32d744f8b8fd20c711.zip
   
   > Task :lucene:analysis:common:snowballGen
   cc -O2 -W -Wall -Wmissing-prototypes -Wmissing-declarations -Iinclude  -c -o compiler/generator_java.o compiler/generator_java.c
   cc -O2 -W -Wall -Wmissing-prototypes -Wmissing-declarations  -o snowball compiler/space.o compiler/tokeniser.o compiler/analyser.o compiler/generator.o compiler/driver.o compiler/generator_csharp.o compiler/generator_java.o compiler/generator_js.o compiler/generator_pascal.o compiler/generator_python.o compiler/generator_rust.o compiler/generator_go.o
   ./snowball algorithms/arabic.sbl -j -o java/org/tartarus/snowball/ext/arabicStemmer -p org.tartarus.snowball.SnowballStemmer
   ./snowball algorithms/armenian.sbl -j -o java/org/tartarus/snowball/ext/armenianStemmer -p org.tartarus.snowball.SnowballStemmer
   ./snowball algorithms/basque.sbl -j -o java/org/tartarus/snowball/ext/basqueStemmer -p org.tartarus.snowball.SnowballStemmer
   ./snowball algorithms/catalan.sbl -j -o java/org/tartarus/snowball/ext/catalanStemmer -p org.tartarus.snowball.SnowballStemmer
   ./snowball algorithms/danish.sbl -j -o java/org/tartarus/snowball/ext/danishStemmer -p org.tartarus.snowball.SnowballStemmer
   ./snowball algorithms/dutch.sbl -j -o java/org/tartarus/snowball/ext/dutchStemmer -p org.tartarus.snowball.SnowballStemmer
   ./snowball algorithms/english.sbl -j -o java/org/tartarus/snowball/ext/englishStemmer -p org.tartarus.snowball.SnowballStemmer
   ./snowball algorithms/estonian.sbl -j -o java/org/tartarus/snowball/ext/estonianStemmer -p org.tartarus.snowball.SnowballStemmer
   ./snowball algorithms/finnish.sbl -j -o java/org/tartarus/snowball/ext/finnishStemmer -p org.tartarus.snowball.SnowballStemmer
   ./snowball algorithms/french.sbl -j -o java/org/tartarus/snowball/ext/frenchStemmer -p org.tartarus.snowball.SnowballStemmer
   ./snowball algorithms/german.sbl -j -o java/org/tartarus/snowball/ext/germanStemmer -p org.tartarus.snowball.SnowballStemmer
   ./snowball algorithms/german2.sbl -j -o java/org/tartarus/snowball/ext/german2Stemmer -p org.tartarus.snowball.SnowballStemmer
   ./snowball algorithms/greek.sbl -j -o java/org/tartarus/snowball/ext/greekStemmer -p org.tartarus.snowball.SnowballStemmer
   ./snowball algorithms/hindi.sbl -j -o java/org/tartarus/snowball/ext/hindiStemmer -p org.tartarus.snowball.SnowballStemmer
   ./snowball algorithms/hungarian.sbl -j -o java/org/tartarus/snowball/ext/hungarianStemmer -p org.tartarus.snowball.SnowballStemmer
   ./snowball algorithms/indonesian.sbl -j -o java/org/tartarus/snowball/ext/indonesianStemmer -p org.tartarus.snowball.SnowballStemmer
   ./snowball algorithms/irish.sbl -j -o java/org/tartarus/snowball/ext/irishStemmer -p org.tartarus.snowball.SnowballStemmer
   ./snowball algorithms/italian.sbl -j -o java/org/tartarus/snowball/ext/italianStemmer -p org.tartarus.snowball.SnowballStemmer
   ./snowball algorithms/kraaij_pohlmann.sbl -j -o java/org/tartarus/snowball/ext/kraaij_pohlmannStemmer -p org.tartarus.snowball.SnowballStemmer
   ./snowball algorithms/lithuanian.sbl -j -o java/org/tartarus/snowball/ext/lithuanianStemmer -p org.tartarus.snowball.SnowballStemmer
   ./snowball algorithms/lovins.sbl -j -o java/org/tartarus/snowball/ext/lovinsStemmer -p org.tartarus.snowball.SnowballStemmer
   ./snowball algorithms/nepali.sbl -j -o java/org/tartarus/snowball/ext/nepaliStemmer -p org.tartarus.snowball.SnowballStemmer
   ./snowball algorithms/norwegian.sbl -j -o java/org/tartarus/snowball/ext/norwegianStemmer -p org.tartarus.snowball.SnowballStemmer
   ./snowball algorithms/porter.sbl -j -o java/org/tartarus/snowball/ext/porterStemmer -p org.tartarus.snowball.SnowballStemmer
   ./snowball algorithms/portuguese.sbl -j -o java/org/tartarus/snowball/ext/portugueseStemmer -p org.tartarus.snowball.SnowballStemmer
   ./snowball algorithms/romanian.sbl -j -o java/org/tartarus/snowball/ext/romanianStemmer -p org.tartarus.snowball.SnowballStemmer
   ./snowball algorithms/russian.sbl -j -o java/org/tartarus/snowball/ext/russianStemmer -p org.tartarus.snowball.SnowballStemmer
   ./snowball algorithms/serbian.sbl -j -o java/org/tartarus/snowball/ext/serbianStemmer -p org.tartarus.snowball.SnowballStemmer
   ./snowball algorithms/spanish.sbl -j -o java/org/tartarus/snowball/ext/spanishStemmer -p org.tartarus.snowball.SnowballStemmer
   ./snowball algorithms/swedish.sbl -j -o java/org/tartarus/snowball/ext/swedishStemmer -p org.tartarus.snowball.SnowballStemmer
   ./snowball algorithms/tamil.sbl -j -o java/org/tartarus/snowball/ext/tamilStemmer -p org.tartarus.snowball.SnowballStemmer
   ./snowball algorithms/turkish.sbl -j -o java/org/tartarus/snowball/ext/turkishStemmer -p org.tartarus.snowball.SnowballStemmer
   destname=libstemmer_java; \
   dest=dist/${destname}; \
   rm -rf ${dest} && \
   rm -f ${dest}.tgz && \
   mkdir -p ${dest} && \
   cp -a doc/libstemmer_java_README ${dest}/README && \
   mkdir -p ${dest}/java/org/tartarus/snowball/ext && \
   cp -a java/org/tartarus/snowball/ext/arabicStemmer.java java/org/tartarus/snowball/ext/armenianStemmer.java java/org/tartarus/snowball/ext/basqueStemmer.java java/org/tartarus/snowball/ext/catalanStemmer.java java/org/tartarus/snowball/ext/danishStemmer.java java/org/tartarus/snowball/ext/dutchStemmer.java java/org/tartarus/snowball/ext/englishStemmer.java java/org/tartarus/snowball/ext/estonianStemmer.java java/org/tartarus/snowball/ext/finnishStemmer.java java/org/tartarus/snowball/ext/frenchStemmer.java java/org/tartarus/snowball/ext/germanStemmer.java java/org/tartarus/snowball/ext/german2Stemmer.java java/org/tartarus/snowball/ext/greekStemmer.java java/org/tartarus/snowball/ext/hindiStemmer.java java/org/tartarus/snowball/ext/hungarianStemmer.java java/org/tartarus/snowball/ext/indonesianStemmer.java java/org/tartarus/snowball/ext/irishStemmer.java java/org/tartarus/snowball/ext/italianStemmer.java java/org/tartarus/snowball/ext/kraaij_pohlmannStemmer.java java/org/tartarus/snowball/ext/lithuanianStemmer.java java/org/tartarus/snowball/ext/lovinsStemmer.java java/org/tartarus/snowball/ext/nepaliStemmer.java java/org/tartarus/snowball/ext/norwegianStemmer.java java/org/tartarus/snowball/ext/porterStemmer.java java/org/tartarus/snowball/ext/portugueseStemmer.java java/org/tartarus/snowball/ext/romanianStemmer.java java/org/tartarus/snowball/ext/russianStemmer.java java/org/tartarus/snowball/ext/serbianStemmer.java java/org/tartarus/snowball/ext/spanishStemmer.java java/org/tartarus/snowball/ext/swedishStemmer.java java/org/tartarus/snowball/ext/tamilStemmer.java java/org/tartarus/snowball/ext/turkishStemmer.java ${dest}/java/org/tartarus/snowball/ext && \
   mkdir -p ${dest}/java/org/tartarus/snowball && \
   cp -a java/org/tartarus/snowball/Among.java java/org/tartarus/snowball/SnowballProgram.java java/org/tartarus/snowball/SnowballStemmer.java java/org/tartarus/snowball/TestApp.java ${dest}/java/org/tartarus/snowball && \
   cp -a COPYING NEWS ${dest} && \
   (cd ${dest} && \
    echo "README" >> MANIFEST && \
    ls java/org/tartarus/snowball/ext/*.java >> MANIFEST && \
    ls java/org/tartarus/snowball/*.java >> MANIFEST) && \
   (cd dist && tar zcf ${destname}.tgz ${destname}) && \
   rm -rf ${dest}
   
   > Task :snowball
   
   BUILD SUCCESSFUL in 18s
   4 actionable tasks: 4 executed
   <-------------> 0% WAITING
   think:lucene-solr[newFrenchStopWords]$ git status
   On branch newFrenchStopWords
   Your branch is up to date with 'camellia/newFrenchStopWords'.
   
   Changes not staged for commit:
     (use "git add <file>..." to update what will be committed)
     (use "git restore <file>..." to discard changes in working directory)
   	modified:   lucene/analysis/common/src/resources/org/apache/lucene/analysis/snowball/french_stop.txt
   
   no changes added to commit (use "git add" and/or "git commit -a")
   think:lucene-solr[newFrenchStopWords]$ git diff
   diff --git a/lucene/analysis/common/src/resources/org/apache/lucene/analysis/snowball/french_stop.txt b/lucene/analysis/common/src/resources/org/apache/lucene/analysis/snowball/french_stop.txt
   index 82404502b43..658ae9c91ac 100644
   --- a/lucene/analysis/common/src/resources/org/apache/lucene/analysis/snowball/french_stop.txt
   +++ b/lucene/analysis/common/src/resources/org/apache/lucene/analysis/snowball/french_stop.txt
   @@ -183,3 +183,4 @@ quelle         |  which
    quelles        |  which
    sans           |  without
    soi            |  oneself
   +
   ```
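
The diff above shows only an added blank line, so the manually edited file and the regenerated one should hold the same words. A minimal sketch of how that could be checked, assuming the Snowball word-list format in which `|` starts a comment and a line may hold several whitespace-separated words. The function names and the sample excerpts are hypothetical and not part of Lucene:

```python
def parse_snowball_stopwords(text):
    """Return the set of stop words in a Snowball-format word list."""
    words = set()
    for line in text.splitlines():
        # Everything after the first '|' is a comment.
        content = line.split("|", 1)[0]
        # A line may hold several whitespace-separated words, or none.
        words.update(content.split())
    return words


def same_stopwords(a, b):
    """True when two word lists differ only in comments or whitespace."""
    return parse_snowball_stopwords(a) == parse_snowball_stopwords(b)


# Hypothetical excerpts: the second one carries an extra trailing blank line,
# mirroring the whitespace-only difference reported by `git diff`.
committed = (
    "quelles        |  which\n"
    "sans           |  without\n"
    "soi            |  oneself\n"
)
regenerated = committed + "\n"

print(same_stopwords(committed, regenerated))  # prints True
```

A check like this only confirms that the word sets match. The regenerated file is still the one to commit, so that the repository stays byte-for-byte in sync with the generator output.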




[GitHub] [lucene-solr] filcius commented on pull request #1474: LUCENE-9354: Sync French stop words with latest version from Snowball.

Posted by GitBox <gi...@apache.org>.
filcius commented on pull request #1474:
URL: https://github.com/apache/lucene-solr/pull/1474#issuecomment-622359248


   I did not run gradle or compile anything here. I don't have that kind of setup here. If it is required, I propose someone else commit to my branch and continue my work.




[GitHub] [lucene-solr] filcius commented on pull request #1474: LUCENE-9354: Sync French stop words with latest version from Snowball.

Posted by GitBox <gi...@apache.org>.
filcius commented on pull request #1474:
URL: https://github.com/apache/lucene-solr/pull/1474#issuecomment-622397809


   Looks good, thanks @rmuir 




[GitHub] [lucene-solr] filcius commented on pull request #1474: LUCENE-9354: Sync French stop words with latest version from Snowball.

Posted by GitBox <gi...@apache.org>.
filcius commented on pull request #1474:
URL: https://github.com/apache/lucene-solr/pull/1474#issuecomment-622360810


   I don't see any other changes in snowball-website, so this may be fine https://github.com/snowballstem/snowball-website/compare/ff891e74f08e7315523ee3c0cad55bb1b7831b9d...master
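
The link above uses GitHub's `compare/<base>...<head>` URL pattern. A minimal sketch that builds such a link for any base commit, assuming that pattern. The helper name `compare_url` is hypothetical, and the hash is the one from the link above:

```python
def compare_url(repo, base, head="master"):
    """Build a GitHub compare URL for reviewing what changed on head since base."""
    return f"https://github.com/{repo}/compare/{base}...{head}"


# Base commit taken from the link in the comment above.
base_commit = "ff891e74f08e7315523ee3c0cad55bb1b7831b9d"
print(compare_url("snowballstem/snowball-website", base_commit))
```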

