Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2020/02/16 00:19:12 UTC

[GitHub] [lucene-solr] rmuir opened a new pull request #1262: LUCENE-9220: regenerate all stemmers from snowball 2.0

rmuir opened a new pull request #1262: LUCENE-9220: regenerate all stemmers from snowball 2.0
URL: https://github.com/apache/lucene-solr/pull/1262
 
 
   Instead of patching the generated stemmers after the fact (as has been
   done both manually and automatically over the years), we patch the generator.
   
   This is easier to maintain than patches/changes against generated code.
   See LUCENE-9220 for more information.
   
   One nocommit remains: the test data. The new languages added here also
   still need to be hooked in and tested.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] dweiss commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0

Posted by GitBox <gi...@apache.org>.
dweiss commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0
URL: https://github.com/apache/lucene-solr/pull/1262#discussion_r380138880
 
 

 ##########
 File path: gradle/generation/snowball.gradle
 ##########
 @@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+apply plugin: "de.undercouch.download"
+
+configure(rootProject) {
+  ext {
+    snowballWorkDir    = file("${buildDir}/snowball")
+    snowballStemmerDir = file("${buildDir}/snowball/stemmers")
+    snowballPatchedDir = file("${buildDir}/snowball/patched")
+    snowballWebsiteDir = file("${buildDir}/snowball/website")
+    snowballDataDir    = file("${buildDir}/snowball/data")
+  }
+
+  task snowball()  {
+    description "Regenerate snowball-based sources, stopwords, and tests for ...lucene/analysis."
+    group "generation"
+
+    dependsOn ":lucene:analysis:common:snowballGen"
+  }
+
+  task downloadSnowballStemmers(type: Download) {
+    def stemmerZip = file("${snowballWorkDir}/stemmers.zip")
+
+    src "https://github.com/snowballstem/snowball/archive/53739a805cfa6c77ff8496dc711dc1c106d987c1.zip"
+    dest stemmerZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/stemmers.json")
+
+    doLast {
+      ant.unzip(src: stemmerZip, dest: snowballStemmerDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballWebsite(type: Download) {
+    def websiteZip = file("${snowballWorkDir}/website.zip")
+
+    src "https://github.com/snowballstem/snowball-website/archive/ff891e74f08e7315523ee3c0cad55bb1b7831b9d.zip"
+    dest websiteZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/website.json")
+
+    doLast {
+      ant.unzip(src: websiteZip, dest: snowballWebsiteDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballData(type: Download) {
+    def dataZip = file("${snowballWorkDir}/data.zip")
+
+    src "https://github.com/snowballstem/snowball-data/archive/9145f8732ec952c8a3d1066be251da198a8bc792.zip"
+    dest dataZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/data.json")
+
+    doLast {
+      ant.unzip(src: dataZip, dest: snowballDataDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task cleanSnowballCheckout(type: Delete) {
+    dependsOn downloadSnowballStemmers
+    delete snowballPatchedDir
+  }
+
+  task patchSnowball(type: Copy) {
+    dependsOn cleanSnowballCheckout
+
+    from fileTree(snowballStemmerDir) {
+      include '**/*'
+    }
+    into snowballPatchedDir
+
+    doLast {
+      ant.patch(patchfile: rootProject.file("gradle/generation/snowball.patch"), dir: snowballPatchedDir, strip: "1")
 
 Review comment:
   ant.patch just delegates to the local filesystem's `patch` command, so it doesn't work on Windows. Sigh. Fine with me though.



[GitHub] [lucene-solr] rmuir commented on issue #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0

Posted by GitBox <gi...@apache.org>.
rmuir commented on issue #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0
URL: https://github.com/apache/lucene-solr/pull/1262#issuecomment-586957660
 
 
   I think this is ready; I plan to push it later today. Regeneration has to be run on Linux, but that limit already comes from the snowball makefile, which will not work on Windows. So if you want to do this stuff on Windows, it is best to start over there, with the makefile :)
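   
   A minimal sketch of how the Linux-only limitation could be made explicit in the build, assuming a hypothetical `checkSnowballPlatform` task that is not part of this PR. It only inspects the `os.name` system property and fails with a readable message before `ant.patch` or the makefile gets a chance to fail obscurely:
   
   ```groovy
   // Hypothetical guard task, not part of the PR: fail early on Windows with a
   // clear message instead of failing later inside ant.patch or the makefile.
   task checkSnowballPlatform {
     doLast {
       def osName = System.getProperty("os.name").toLowerCase(Locale.ROOT)
       if (osName.contains("windows")) {
         throw new GradleException(
             "Snowball regeneration needs the 'patch' and 'make' commands; run it on Linux.")
       }
     }
   }
   
   // Assumes patchSnowball is the task from the quoted snowball.gradle and is
   // defined in the same project as this guard.
   patchSnowball.dependsOn checkSnowballPlatform
   ```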



[GitHub] [lucene-solr] rmuir edited a comment on issue #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0

Posted by GitBox <gi...@apache.org>.
rmuir edited a comment on issue #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0
URL: https://github.com/apache/lucene-solr/pull/1262#issuecomment-587045242
 
 
   @dweiss OK, I got it working so that it no longer downloads every time and no longer relies on HTTP caching. Instead, things are named by commit hash. I had to add an explicit input dependency on the patch file, so that it re-downloads and re-patches if needed.
   
   Now you see on second run:
   ```
   > Task :lucene:analysis:common:downloadSnowballData UP-TO-DATE
   > Task :lucene:analysis:common:downloadSnowballStemmers UP-TO-DATE
   > Task :lucene:analysis:common:downloadSnowballWebsite UP-TO-DATE
   ```
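   
   One way to get this behaviour is to put the commit hash into the local file names, so that a new hash produces new paths and an unchanged hash leaves nothing to do. The following is a sketch under assumptions, not the merged build file: `snowballStemmerCommit` is a hypothetical property name, and it relies on the `overwrite` option of the de.undercouch Download task.
   
   ```groovy
   apply plugin: "de.undercouch.download"
   
   ext {
     // Hypothetical property name; the hash is the one from the quoted diff.
     snowballStemmerCommit = "53739a805cfa6c77ff8496dc711dc1c106d987c1"
     snowballWorkDir       = file("${buildDir}/snowball")
     snowballStemmerDir    = file("${snowballWorkDir}/stemmers-${snowballStemmerCommit}")
   }
   
   task downloadSnowballStemmers(type: Download) {
     def stemmerZip = file("${snowballStemmerDir}.zip")
   
     // Assumption about how the explicit input dependency is wired: declaring
     // the patch file as a task input makes a changed patch invalidate the task.
     inputs.file(rootProject.file("gradle/generation/snowball.patch"))
   
     src "https://github.com/snowballstem/snowball/archive/${snowballStemmerCommit}.zip"
     dest stemmerZip
     // Do not fetch again when the hash-named zip is already on disk.
     overwrite false
   
     doLast {
       ant.unzip(src: stemmerZip, dest: snowballStemmerDir, overwrite: "true") {
         ant.cutdirsmapper(dirs: "1")
       }
     }
   }
   ```
   
   Because the hash is part of both the zip name and the unpack directory, bumping the hash never reuses stale files from an older checkout.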



[GitHub] [lucene-solr] dweiss commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0

Posted by GitBox <gi...@apache.org>.
dweiss commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0
URL: https://github.com/apache/lucene-solr/pull/1262#discussion_r380144818
 
 

 ##########
 File path: gradle/generation/snowball.gradle
 ##########
 @@ -0,0 +1,118 @@
+  task cleanSnowballCheckout(type: Delete) {
 
 Review comment:
   At first glance, the doFirst trick should be equivalent to the task dependencies the way you defined them. Let me know if this turns into a problem though.
   
   A way to properly "download once" would be to define a configuration dependency with a custom resolver and let gradle cache the artifact... but it's too much work for a small gain, I think. What you did is already an improvement and we regenerate that stuff infrequently.
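   
   For illustration, one possible reading of the "configuration dependency with a custom resolver" idea is an Ivy repository whose artifact pattern matches GitHub archive URLs, so that Gradle's dependency cache holds the zip. This is a sketch only; the configuration name `snowballSources` and the task name `unpackSnowballStemmers` are hypothetical, and `snowballStemmerDir` is assumed to be defined as in the quoted file.
   
   ```groovy
   repositories {
     // Treat GitHub archive URLs as an Ivy repository without metadata files.
     ivy {
       url "https://github.com/"
       patternLayout {
         artifact "/[organisation]/[module]/archive/[revision].[ext]"
       }
       metadataSources { artifact() }
     }
   }
   
   configurations {
     snowballSources  // hypothetical configuration name
   }
   
   dependencies {
     // group:module:revision@ext is mapped onto the artifact pattern above.
     snowballSources "snowballstem:snowball:53739a805cfa6c77ff8496dc711dc1c106d987c1@zip"
   }
   
   task unpackSnowballStemmers(type: Copy) {
     // Resolved lazily; the zip comes from Gradle's dependency cache after the
     // first download. Unlike cutdirsmapper in the quoted file, this keeps the
     // archive's top-level directory.
     from { zipTree(configurations.snowballSources.singleFile) }
     into snowballStemmerDir
   }
   ```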



[GitHub] [lucene-solr] rmuir commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0

Posted by GitBox <gi...@apache.org>.
rmuir commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0
URL: https://github.com/apache/lucene-solr/pull/1262#discussion_r380185634
 
 

 ##########
 File path: gradle/generation/snowball.gradle
 ##########
 @@ -0,0 +1,118 @@
+  task patchSnowball(type: Copy) {
+    dependsOn cleanSnowballCheckout
+
+    from fileTree(snowballStemmerDir) {
+      include '**/*'
+    }
+    into snowballPatchedDir
+
+    doLast {
+      ant.patch(patchfile: rootProject.file("gradle/generation/snowball.patch"), dir: snowballPatchedDir, strip: "1")
 
 Review comment:
   @dweiss So we can add these dependencies, but I just don't want it to patch twice :) That's kinda why there is an explicit "clean" here to create the patched/ work area.
   
   I think the behavior is fast enough for a first go and we can follow up? It takes 26s the first time and 4s the second time for me.
   
   I am worried about making it too smart and declaring such dependencies incorrectly. I want to make sure it does the right thing if you e.g. update the commit hash or patch file, or change the script, so that it's easier to upgrade snowball.



[GitHub] [lucene-solr] rmuir merged pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0

Posted by GitBox <gi...@apache.org>.
rmuir merged pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0
URL: https://github.com/apache/lucene-solr/pull/1262
 
 
   



[GitHub] [lucene-solr] dweiss commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0

Posted by GitBox <gi...@apache.org>.
dweiss commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0
URL: https://github.com/apache/lucene-solr/pull/1262#discussion_r380140182
 
 

 ##########
 File path: lucene/analysis/common/build.xml
 ##########
 @@ -124,14 +122,4 @@
 
   <target name="regenerate" depends="jflex,unicode-data"/>
 
-  <target name="patch-snowball" description="Patches all snowball programs in '${snowball.programs.dir}' to make them work with MethodHandles">
 
 Review comment:
   Thanks for cleaning these up from ant.



[GitHub] [lucene-solr] dweiss commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0

Posted by GitBox <gi...@apache.org>.
dweiss commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0
URL: https://github.com/apache/lucene-solr/pull/1262#discussion_r380137381
 
 

 ##########
 File path: gradle/generation/snowball.gradle
 ##########
 @@ -0,0 +1,118 @@
+configure(rootProject) {
+  ext {
+    snowballWorkDir    = file("${buildDir}/snowball")
+    snowballStemmerDir = file("${buildDir}/snowball/stemmers")
 
 Review comment:
   the latter could be defined relative to snowballWorkDir (${snowballWorkDir}/stemmers)?
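   
   A sketch of what this suggestion could look like, using only the property names from the quoted diff. Each directory is derived from `snowballWorkDir`, so the base path is spelled out once:
   
   ```groovy
   ext {
     snowballWorkDir    = file("${buildDir}/snowball")
     // Derived from snowballWorkDir rather than repeating "${buildDir}/snowball".
     snowballStemmerDir = file("${snowballWorkDir}/stemmers")
     snowballPatchedDir = file("${snowballWorkDir}/patched")
     snowballWebsiteDir = file("${snowballWorkDir}/website")
     snowballDataDir    = file("${snowballWorkDir}/data")
   }
   ```
   
   Moving the work area then only requires changing the first assignment.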



[GitHub] [lucene-solr] rmuir commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0

Posted by GitBox <gi...@apache.org>.
rmuir commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0
URL: https://github.com/apache/lucene-solr/pull/1262#discussion_r380152402
 
 

 ##########
 File path: gradle/generation/snowball.gradle
 ##########
 @@ -0,0 +1,118 @@
+configure(rootProject) {
+  ext {
+    snowballWorkDir    = file("${buildDir}/snowball")
+    snowballStemmerDir = file("${buildDir}/snowball/stemmers")
+    snowballPatchedDir = file("${buildDir}/snowball/patched")
+    snowballWebsiteDir = file("${buildDir}/snowball/website")
+    snowballDataDir    = file("${buildDir}/snowball/data")
+  }
+
+  task snowball()  {
+    description "Regenerate snowball-based sources, stopwords, and tests for ...lucene/analysis."
+    group "generation"
+
+    dependsOn ":lucene:analysis:common:snowballGen"
+  }
+
+  task downloadSnowballStemmers(type: Download) {
+    def stemmerZip = file("${snowballWorkDir}/stemmers.zip")
+
+    src "https://github.com/snowballstem/snowball/archive/53739a805cfa6c77ff8496dc711dc1c106d987c1.zip"
+    dest stemmerZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/stemmers.json")
+
+    doLast {
+      ant.unzip(src: stemmerZip, dest: snowballStemmerDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballWebsite(type: Download) {
+    def websiteZip = file("${snowballWorkDir}/website.zip")
+
+    src "https://github.com/snowballstem/snowball-website/archive/ff891e74f08e7315523ee3c0cad55bb1b7831b9d.zip"
+    dest websiteZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/website.json")
+
+    doLast {
+      ant.unzip(src: websiteZip, dest: snowballWebsiteDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballData(type: Download) {
+    def dataZip = file("${snowballWorkDir}/data.zip")
+
+    src "https://github.com/snowballstem/snowball-data/archive/9145f8732ec952c8a3d1066be251da198a8bc792.zip"
+    dest dataZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/data.json")
+
+    doLast {
+      ant.unzip(src: dataZip, dest: snowballDataDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task cleanSnowballCheckout(type: Delete) {
+    dependsOn downloadSnowballStemmers
+    delete snowballPatchedDir
+  }
+
+  task patchSnowball(type: Copy) {
+    dependsOn cleanSnowballCheckout
+
+    from fileTree(snowballStemmerDir) {
+      include '**/*'
+    }
+    into snowballPatchedDir
+
+    doLast {
+      ant.patch(patchfile: rootProject.file("gradle/generation/snowball.patch"), dir: snowballPatchedDir, strip: "1")
+    }
+  }
+}
+
+configure(project(":lucene:analysis:common")) {
+  task snowballGen() {
+    description "Patch and Regenerate snowball sources, stopwords, and tests"
+    group "generation"
+    dependsOn rootProject.patchSnowball
 
 Review comment:
   OK, I think this is better now. The tasks now live under analysis common and download to `lucene/analysis/common/build` instead of `build/`. All tasks are under that project, except a single top-level one with the description, which depends on it.



[GitHub] [lucene-solr] rmuir commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0

Posted by GitBox <gi...@apache.org>.
rmuir commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0
URL: https://github.com/apache/lucene-solr/pull/1262#discussion_r380190722
 
 

 ##########
 File path: gradle/generation/snowball.gradle
 ##########
 @@ -0,0 +1,118 @@
+  task patchSnowball(type: Copy) {
+    dependsOn cleanSnowballCheckout
+
+    from fileTree(snowballStemmerDir) {
+      include '**/*'
+    }
+    into snowballPatchedDir
+
+    doLast {
+      ant.patch(patchfile: rootProject.file("gradle/generation/snowball.patch"), dir: snowballPatchedDir, strip: "1")
 
 Review comment:
   Yes, but if you teach a man to fish... I will look into it. It's just tricky: for example, the patching task in my mind depends on "cleanCheckoutOfCurrentlySpecifiedCommitHash + the input patch file itself", and maybe we must "touch" an output file indicating the patch was applied. This way it doesn't try to apply it twice. At least this is how I would do it with something like `make`.
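   
   The `make`-style idea above maps fairly directly onto Gradle's declared task inputs and outputs. A minimal sketch, assuming the directory properties and `downloadSnowballStemmers` task from the quoted file; the marker file name is hypothetical:
   
   ```groovy
   task patchSnowball {
     dependsOn downloadSnowballStemmers
   
     def patchFile = rootProject.file("gradle/generation/snowball.patch")
     // Hypothetical marker file, "touched" after a successful patch run.
     def marker = file("${snowballWorkDir}/patched.marker")
   
     inputs.dir(snowballStemmerDir)   // the clean checkout
     inputs.file(patchFile)           // the patch itself
     outputs.file(marker)
   
     doLast {
       // Always start from a fresh copy so the patch is never applied twice.
       delete snowballPatchedDir
       copy {
         from snowballStemmerDir
         into snowballPatchedDir
       }
       ant.patch(patchfile: patchFile, dir: snowballPatchedDir, strip: "1")
       marker.text = "patched\n"
     }
   }
   ```
   
   With the inputs and the marker declared, Gradle can skip the task when neither the checkout nor the patch file changed, and rerun it from a clean copy when either one did.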



[GitHub] [lucene-solr] dweiss commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0

Posted by GitBox <gi...@apache.org>.
dweiss commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0
URL: https://github.com/apache/lucene-solr/pull/1262#discussion_r380169660
 
 

 ##########
 File path: gradle/generation/snowball.gradle
 ##########
 @@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+apply plugin: "de.undercouch.download"
+
+configure(rootProject) {
+  ext {
+    snowballWorkDir    = file("${buildDir}/snowball")
+    snowballStemmerDir = file("${buildDir}/snowball/stemmers")
+    snowballPatchedDir = file("${buildDir}/snowball/patched")
+    snowballWebsiteDir = file("${buildDir}/snowball/website")
+    snowballDataDir    = file("${buildDir}/snowball/data")
+  }
+
+  task snowball()  {
+    description "Regenerate snowball-based sources, stopwords, and tests for ...lucene/analysis."
+    group "generation"
+
+    dependsOn ":lucene:analysis:common:snowballGen"
+  }
+
+  task downloadSnowballStemmers(type: Download) {
+    def stemmerZip = file("${snowballWorkDir}/stemmers.zip")
+
+    src "https://github.com/snowballstem/snowball/archive/53739a805cfa6c77ff8496dc711dc1c106d987c1.zip"
+    dest stemmerZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/stemmers.json")
+
+    doLast {
+      ant.unzip(src: stemmerZip, dest: snowballStemmerDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballWebsite(type: Download) {
+    def websiteZip = file("${snowballWorkDir}/website.zip")
+
+    src "https://github.com/snowballstem/snowball-website/archive/ff891e74f08e7315523ee3c0cad55bb1b7831b9d.zip"
+    dest websiteZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/website.json")
+
+    doLast {
+      ant.unzip(src: websiteZip, dest: snowballWebsiteDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballData(type: Download) {
+    def dataZip = file("${snowballWorkDir}/data.zip")
+
+    src "https://github.com/snowballstem/snowball-data/archive/9145f8732ec952c8a3d1066be251da198a8bc792.zip"
+    dest dataZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/data.json")
+
+    doLast {
+      ant.unzip(src: dataZip, dest: snowballDataDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task cleanSnowballCheckout(type: Delete) {
+    dependsOn downloadSnowballStemmers
+    delete snowballPatchedDir
+  }
+
+  task patchSnowball(type: Copy) {
+    dependsOn cleanSnowballCheckout
+
+    from fileTree(snowballStemmerDir) {
+      include '**/*'
+    }
+    into snowballPatchedDir
+
+    doLast {
+      ant.patch(patchfile: rootProject.file("gradle/generation/snowball.patch"), dir: snowballPatchedDir, strip: "1")
 
 Review comment:
   That download task is broken in that it still executes actions even if the artifact is up to date... You can just delete the unpacked stuff and not the downloaded zip; then it'll wipe and unzip but not download again?



[GitHub] [lucene-solr] rmuir commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0

Posted by GitBox <gi...@apache.org>.
rmuir commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0
URL: https://github.com/apache/lucene-solr/pull/1262#discussion_r380160792
 
 

 ##########
 File path: gradle/generation/snowball.gradle
 ##########
 @@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+apply plugin: "de.undercouch.download"
+
+configure(rootProject) {
+  ext {
+    snowballWorkDir    = file("${buildDir}/snowball")
+    snowballStemmerDir = file("${buildDir}/snowball/stemmers")
+    snowballPatchedDir = file("${buildDir}/snowball/patched")
+    snowballWebsiteDir = file("${buildDir}/snowball/website")
+    snowballDataDir    = file("${buildDir}/snowball/data")
+  }
+
+  task snowball()  {
+    description "Regenerate snowball-based sources, stopwords, and tests for ...lucene/analysis."
+    group "generation"
+
+    dependsOn ":lucene:analysis:common:snowballGen"
+  }
+
+  task downloadSnowballStemmers(type: Download) {
+    def stemmerZip = file("${snowballWorkDir}/stemmers.zip")
+
+    src "https://github.com/snowballstem/snowball/archive/53739a805cfa6c77ff8496dc711dc1c106d987c1.zip"
+    dest stemmerZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/stemmers.json")
+
+    doLast {
+      ant.unzip(src: stemmerZip, dest: snowballStemmerDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballWebsite(type: Download) {
+    def websiteZip = file("${snowballWorkDir}/website.zip")
+
+    src "https://github.com/snowballstem/snowball-website/archive/ff891e74f08e7315523ee3c0cad55bb1b7831b9d.zip"
+    dest websiteZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/website.json")
+
+    doLast {
+      ant.unzip(src: websiteZip, dest: snowballWebsiteDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballData(type: Download) {
+    def dataZip = file("${snowballWorkDir}/data.zip")
+
+    src "https://github.com/snowballstem/snowball-data/archive/9145f8732ec952c8a3d1066be251da198a8bc792.zip"
+    dest dataZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/data.json")
+
+    doLast {
+      ant.unzip(src: dataZip, dest: snowballDataDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task cleanSnowballCheckout(type: Delete) {
+    dependsOn downloadSnowballStemmers
+    delete snowballPatchedDir
+  }
+
+  task patchSnowball(type: Copy) {
+    dependsOn cleanSnowballCheckout
+
+    from fileTree(snowballStemmerDir) {
+      include '**/*'
+    }
+    into snowballPatchedDir
+
+    doLast {
+      ant.patch(patchfile: rootProject.file("gradle/generation/snowball.patch"), dir: snowballPatchedDir, strip: "1")
 
 Review comment:
   Maybe we can focus on trying to remove the patch completely (try to package up some PRs for the snowball team), so that it gets fixed even before their Makefile supports Windows :)
   
   I guess I optimistically hope the patch solution is a temporary one. If you look deeper into the shell script and snowball's GNUmakefile, it may become obvious that we have other issues to fix for Windows first.



[GitHub] [lucene-solr] dweiss commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0

Posted by GitBox <gi...@apache.org>.
dweiss commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0
URL: https://github.com/apache/lucene-solr/pull/1262#discussion_r380139377
 
 

 ##########
 File path: gradle/generation/snowball.gradle
 ##########
 @@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+apply plugin: "de.undercouch.download"
+
+configure(rootProject) {
+  ext {
+    snowballWorkDir    = file("${buildDir}/snowball")
+    snowballStemmerDir = file("${buildDir}/snowball/stemmers")
+    snowballPatchedDir = file("${buildDir}/snowball/patched")
+    snowballWebsiteDir = file("${buildDir}/snowball/website")
+    snowballDataDir    = file("${buildDir}/snowball/data")
+  }
+
+  task snowball()  {
+    description "Regenerate snowball-based sources, stopwords, and tests for ...lucene/analysis."
+    group "generation"
+
+    dependsOn ":lucene:analysis:common:snowballGen"
+  }
+
+  task downloadSnowballStemmers(type: Download) {
+    def stemmerZip = file("${snowballWorkDir}/stemmers.zip")
+
+    src "https://github.com/snowballstem/snowball/archive/53739a805cfa6c77ff8496dc711dc1c106d987c1.zip"
+    dest stemmerZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/stemmers.json")
+
+    doLast {
+      ant.unzip(src: stemmerZip, dest: snowballStemmerDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballWebsite(type: Download) {
+    def websiteZip = file("${snowballWorkDir}/website.zip")
+
+    src "https://github.com/snowballstem/snowball-website/archive/ff891e74f08e7315523ee3c0cad55bb1b7831b9d.zip"
+    dest websiteZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/website.json")
+
+    doLast {
+      ant.unzip(src: websiteZip, dest: snowballWebsiteDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballData(type: Download) {
+    def dataZip = file("${snowballWorkDir}/data.zip")
+
+    src "https://github.com/snowballstem/snowball-data/archive/9145f8732ec952c8a3d1066be251da198a8bc792.zip"
+    dest dataZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/data.json")
+
+    doLast {
+      ant.unzip(src: dataZip, dest: snowballDataDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task cleanSnowballCheckout(type: Delete) {
+    dependsOn downloadSnowballStemmers
+    delete snowballPatchedDir
+  }
+
+  task patchSnowball(type: Copy) {
+    dependsOn cleanSnowballCheckout
+
+    from fileTree(snowballStemmerDir) {
+      include '**/*'
+    }
+    into snowballPatchedDir
+
+    doLast {
+      ant.patch(patchfile: rootProject.file("gradle/generation/snowball.patch"), dir: snowballPatchedDir, strip: "1")
+    }
+  }
+}
+
+configure(project(":lucene:analysis:common")) {
+  task snowballGen() {
+    description "Patch and Regenerate snowball sources, stopwords, and tests"
+    group "generation"
+    dependsOn rootProject.patchSnowball
 
 Review comment:
   Why not move all these tasks into lucene:analysis:common configuration? Are they reused anywhere else? If not then it'd make everything local to the project.



[GitHub] [lucene-solr] rmuir commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0

Posted by GitBox <gi...@apache.org>.
rmuir commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0
URL: https://github.com/apache/lucene-solr/pull/1262#discussion_r380175604
 
 

 ##########
 File path: gradle/generation/snowball.gradle
 ##########
 @@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+apply plugin: "de.undercouch.download"
+
+configure(rootProject) {
+  ext {
+    snowballWorkDir    = file("${buildDir}/snowball")
+    snowballStemmerDir = file("${buildDir}/snowball/stemmers")
+    snowballPatchedDir = file("${buildDir}/snowball/patched")
+    snowballWebsiteDir = file("${buildDir}/snowball/website")
+    snowballDataDir    = file("${buildDir}/snowball/data")
+  }
+
+  task snowball()  {
+    description "Regenerate snowball-based sources, stopwords, and tests for ...lucene/analysis."
+    group "generation"
+
+    dependsOn ":lucene:analysis:common:snowballGen"
+  }
+
+  task downloadSnowballStemmers(type: Download) {
+    def stemmerZip = file("${snowballWorkDir}/stemmers.zip")
+
+    src "https://github.com/snowballstem/snowball/archive/53739a805cfa6c77ff8496dc711dc1c106d987c1.zip"
+    dest stemmerZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/stemmers.json")
+
+    doLast {
+      ant.unzip(src: stemmerZip, dest: snowballStemmerDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballWebsite(type: Download) {
+    def websiteZip = file("${snowballWorkDir}/website.zip")
+
+    src "https://github.com/snowballstem/snowball-website/archive/ff891e74f08e7315523ee3c0cad55bb1b7831b9d.zip"
+    dest websiteZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/website.json")
+
+    doLast {
+      ant.unzip(src: websiteZip, dest: snowballWebsiteDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballData(type: Download) {
+    def dataZip = file("${snowballWorkDir}/data.zip")
+
+    src "https://github.com/snowballstem/snowball-data/archive/9145f8732ec952c8a3d1066be251da198a8bc792.zip"
+    dest dataZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/data.json")
+
+    doLast {
+      ant.unzip(src: dataZip, dest: snowballDataDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task cleanSnowballCheckout(type: Delete) {
+    dependsOn downloadSnowballStemmers
+    delete snowballPatchedDir
+  }
+
+  task patchSnowball(type: Copy) {
+    dependsOn cleanSnowballCheckout
+
+    from fileTree(snowballStemmerDir) {
+      include '**/*'
+    }
+    into snowballPatchedDir
+
+    doLast {
+      ant.patch(patchfile: rootProject.file("gradle/generation/snowball.patch"), dir: snowballPatchedDir, strip: "1")
 
 Review comment:
   Note that this same trick (HTTP cache revalidation) cannot be used for moman, because its server disables caching completely and doesn't send such headers. In fact, the server does not even respond correctly to requests from `curl` (unless you modify the User-Agent header), so it is clear they are not excited about lots of automated requests. That download should be improved some other way...
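
As a rough illustration of what HTTP cache revalidation means here, the following shell sketch sends the cached ETag back in an `If-None-Match` header and keeps the local file when the server answers 304 Not Modified. It is an assumption-laden sketch, not what the download plugin does internally: the helper name `fetch_if_changed` and the ETag file layout are hypothetical, and only long-standing `curl` options are used.

```shell
#!/usr/bin/env bash
set -eu

# fetch_if_changed URL DEST ETAG_FILE
# Hypothetical helper: re-download DEST only when the server reports
# that its ETag no longer matches the cached one.
fetch_if_changed() {
  local url="$1" dest="$2" etag_file="$3"
  local tmp headers status
  tmp=$(mktemp)
  headers=$(mktemp)

  # Reuse the positional parameters as the curl argument list.
  set -- --silent --show-error --location
  # Revalidate only when both the artifact and its cached ETag exist.
  if [ -f "$dest" ] && [ -s "$etag_file" ]; then
    set -- "$@" --header "If-None-Match: $(cat "$etag_file")"
  fi

  status=$(curl "$@" --dump-header "$headers" --output "$tmp" \
                --write-out '%{http_code}' "$url")

  case "$status" in
    304)
      echo "not modified, keeping $dest"
      ;;
    200)
      mv "$tmp" "$dest"
      # Store the validator of the final response for the next run.
      tr -d '\r' < "$headers" \
        | sed -n 's/^[Ee][Tt][Aa][Gg]: *//p' | tail -n 1 > "$etag_file"
      echo "downloaded $dest"
      ;;
    *)
      echo "unexpected HTTP status $status" >&2
      rm -f "$tmp" "$headers"
      return 1
      ;;
  esac
  rm -f "$tmp" "$headers"
}
```

A server that sends no ETag at all, as described for moman above, leaves the ETag file empty, so every run falls back to an unconditional download. That is exactly why the trick does not help there.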



[GitHub] [lucene-solr] dweiss commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0

Posted by GitBox <gi...@apache.org>.
dweiss commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0
URL: https://github.com/apache/lucene-solr/pull/1262#discussion_r380139766
 
 

 ##########
 File path: gradle/generation/snowball.sh
 ##########
 @@ -0,0 +1,122 @@
+#!/usr/bin/env bash
+
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# remove this script when problems are fixed
+SRCDIR=$1
+WWWSRCDIR=$2
+TESTSRCDIR=$3
+PROJECTDIR=$4
+DESTDIR="${PROJECTDIR}/src/java/org/tartarus/snowball"
+WWWDSTDIR="${PROJECTDIR}/src/resources/org/apache/lucene/analysis/snowball"
+TESTDSTDIR="${PROJECTDIR}/src/test/org/apache/lucene/analysis/snowball"
+
+trap 'echo "usage: ./snowball.sh <snowball> <snowball-website> <snowball-data> <analysis-common>" && exit 2' ERR
+test $# -eq 4
+
+trap 'echo "*** BUILD FAILED ***" $BASH_SOURCE:$LINENO: error: "$BASH_COMMAND" returned $?' ERR
+set -eEuo pipefail
+
+# reformats file indentation to kill the crazy space/tabs mix.
+# prevents early blindness !
+function reformat_java() {
+  # convert tabs to 8 spaces, then reduce indent from 4 space to 2 space
+  sed --in-place -e 's/\t/        /g' -e 's/    /  /g' $1
 
 Review comment:
   Love the old school (sed). :)
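
To make the effect of those two `sed` expressions concrete, here is a small sketch that runs them on standard input instead of editing a file in place. The function name `reindent` and the sample lines are made up for illustration; note that `\t` inside a sed pattern is a GNU sed extension, so this assumes GNU sed, as the `--in-place` flag in the script already does.

```shell
#!/usr/bin/env bash
set -eu

# Same two expressions as reformat_java, applied to stdin.
reindent() {
  # 1) every tab becomes 8 spaces, 2) every run of 4 spaces becomes 2
  sed -e 's/\t/        /g' -e 's/    /  /g'
}

# A tab-indented line, a 4-space-indented line, and a mixed line that
# also carries 4 spaces inside a string literal.
printf '\tint x;\n    int y;\n\t    return "a    b";\n' | reindent
```

With GNU sed the three lines come out indented by 4, 2 and 6 spaces respectively. The second expression is not anchored to the start of the line, so the four spaces inside the string literal on the last line are halved as well; that is worth keeping in mind if the generated sources ever contain meaningful runs of spaces.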



[GitHub] [lucene-solr] dweiss commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0

Posted by GitBox <gi...@apache.org>.
dweiss commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0
URL: https://github.com/apache/lucene-solr/pull/1262#discussion_r380182603
 
 

 ##########
 File path: gradle/generation/snowball.gradle
 ##########
 @@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+apply plugin: "de.undercouch.download"
+
+configure(rootProject) {
+  ext {
+    snowballWorkDir    = file("${buildDir}/snowball")
+    snowballStemmerDir = file("${buildDir}/snowball/stemmers")
+    snowballPatchedDir = file("${buildDir}/snowball/patched")
+    snowballWebsiteDir = file("${buildDir}/snowball/website")
+    snowballDataDir    = file("${buildDir}/snowball/data")
+  }
+
+  task snowball()  {
+    description "Regenerate snowball-based sources, stopwords, and tests for ...lucene/analysis."
+    group "generation"
+
+    dependsOn ":lucene:analysis:common:snowballGen"
+  }
+
+  task downloadSnowballStemmers(type: Download) {
+    def stemmerZip = file("${snowballWorkDir}/stemmers.zip")
+
+    src "https://github.com/snowballstem/snowball/archive/53739a805cfa6c77ff8496dc711dc1c106d987c1.zip"
+    dest stemmerZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/stemmers.json")
+
+    doLast {
+      ant.unzip(src: stemmerZip, dest: snowballStemmerDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballWebsite(type: Download) {
+    def websiteZip = file("${snowballWorkDir}/website.zip")
+
+    src "https://github.com/snowballstem/snowball-website/archive/ff891e74f08e7315523ee3c0cad55bb1b7831b9d.zip"
+    dest websiteZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/website.json")
+
+    doLast {
+      ant.unzip(src: websiteZip, dest: snowballWebsiteDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballData(type: Download) {
+    def dataZip = file("${snowballWorkDir}/data.zip")
+
+    src "https://github.com/snowballstem/snowball-data/archive/9145f8732ec952c8a3d1066be251da198a8bc792.zip"
+    dest dataZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/data.json")
+
+    doLast {
+      ant.unzip(src: dataZip, dest: snowballDataDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task cleanSnowballCheckout(type: Delete) {
+    dependsOn downloadSnowballStemmers
+    delete snowballPatchedDir
+  }
+
+  task patchSnowball(type: Copy) {
+    dependsOn cleanSnowballCheckout
+
+    from fileTree(snowballStemmerDir) {
+      include '**/*'
+    }
+    into snowballPatchedDir
+
+    doLast {
+      ant.patch(patchfile: rootProject.file("gradle/generation/snowball.patch"), dir: snowballPatchedDir, strip: "1")
 
 Review comment:
   This isn't something you have to look into, Rob. I'm just adding some comments, that's it. The patch is fine.
   
   The simplest way of making the task not download the file again would be to declare an inputs.file(path-to-zip) I think. But it's really a marginal issue, I wouldn't spend any special time on this if it works.



[GitHub] [lucene-solr] rmuir commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0

Posted by GitBox <gi...@apache.org>.
rmuir commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0
URL: https://github.com/apache/lucene-solr/pull/1262#discussion_r380144519
 
 

 ##########
 File path: gradle/generation/snowball.gradle
 ##########
 @@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+apply plugin: "de.undercouch.download"
+
+configure(rootProject) {
+  ext {
+    snowballWorkDir    = file("${buildDir}/snowball")
+    snowballStemmerDir = file("${buildDir}/snowball/stemmers")
+    snowballPatchedDir = file("${buildDir}/snowball/patched")
+    snowballWebsiteDir = file("${buildDir}/snowball/website")
+    snowballDataDir    = file("${buildDir}/snowball/data")
+  }
+
+  task snowball()  {
+    description "Regenerate snowball-based sources, stopwords, and tests for ...lucene/analysis."
+    group "generation"
+
+    dependsOn ":lucene:analysis:common:snowballGen"
+  }
+
+  task downloadSnowballStemmers(type: Download) {
+    def stemmerZip = file("${snowballWorkDir}/stemmers.zip")
+
+    src "https://github.com/snowballstem/snowball/archive/53739a805cfa6c77ff8496dc711dc1c106d987c1.zip"
+    dest stemmerZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/stemmers.json")
+
+    doLast {
+      ant.unzip(src: stemmerZip, dest: snowballStemmerDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballWebsite(type: Download) {
+    def websiteZip = file("${snowballWorkDir}/website.zip")
+
+    src "https://github.com/snowballstem/snowball-website/archive/ff891e74f08e7315523ee3c0cad55bb1b7831b9d.zip"
+    dest websiteZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/website.json")
+
+    doLast {
+      ant.unzip(src: websiteZip, dest: snowballWebsiteDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballData(type: Download) {
+    def dataZip = file("${snowballWorkDir}/data.zip")
+
+    src "https://github.com/snowballstem/snowball-data/archive/9145f8732ec952c8a3d1066be251da198a8bc792.zip"
+    dest dataZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/data.json")
+
+    doLast {
+      ant.unzip(src: dataZip, dest: snowballDataDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task cleanSnowballCheckout(type: Delete) {
+    dependsOn downloadSnowballStemmers
+    delete snowballPatchedDir
+  }
+
+  task patchSnowball(type: Copy) {
+    dependsOn cleanSnowballCheckout
+
+    from fileTree(snowballStemmerDir) {
+      include '**/*'
+    }
+    into snowballPatchedDir
+
+    doLast {
+      ant.patch(patchfile: rootProject.file("gradle/generation/snowball.patch"), dir: snowballPatchedDir, strip: "1")
 
 Review comment:
   I think if you have patch.exe on Windows, then this part will work?



[GitHub] [lucene-solr] dweiss commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0

Posted by GitBox <gi...@apache.org>.
dweiss commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0
URL: https://github.com/apache/lucene-solr/pull/1262#discussion_r380205862
 
 

 ##########
 File path: gradle/generation/snowball.gradle
 ##########
 @@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+apply plugin: "de.undercouch.download"
+
+configure(rootProject) {
+  ext {
+    snowballWorkDir    = file("${buildDir}/snowball")
+    snowballStemmerDir = file("${buildDir}/snowball/stemmers")
+    snowballPatchedDir = file("${buildDir}/snowball/patched")
+    snowballWebsiteDir = file("${buildDir}/snowball/website")
+    snowballDataDir    = file("${buildDir}/snowball/data")
+  }
+
+  task snowball()  {
+    description "Regenerate snowball-based sources, stopwords, and tests for ...lucene/analysis."
+    group "generation"
+
+    dependsOn ":lucene:analysis:common:snowballGen"
+  }
+
+  task downloadSnowballStemmers(type: Download) {
+    def stemmerZip = file("${snowballWorkDir}/stemmers.zip")
+
+    src "https://github.com/snowballstem/snowball/archive/53739a805cfa6c77ff8496dc711dc1c106d987c1.zip"
+    dest stemmerZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/stemmers.json")
+
+    doLast {
+      ant.unzip(src: stemmerZip, dest: snowballStemmerDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballWebsite(type: Download) {
+    def websiteZip = file("${snowballWorkDir}/website.zip")
+
+    src "https://github.com/snowballstem/snowball-website/archive/ff891e74f08e7315523ee3c0cad55bb1b7831b9d.zip"
+    dest websiteZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/website.json")
+
+    doLast {
+      ant.unzip(src: websiteZip, dest: snowballWebsiteDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballData(type: Download) {
+    def dataZip = file("${snowballWorkDir}/data.zip")
+
+    src "https://github.com/snowballstem/snowball-data/archive/9145f8732ec952c8a3d1066be251da198a8bc792.zip"
+    dest dataZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/data.json")
+
+    doLast {
+      ant.unzip(src: dataZip, dest: snowballDataDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task cleanSnowballCheckout(type: Delete) {
+    dependsOn downloadSnowballStemmers
+    delete snowballPatchedDir
+  }
+
+  task patchSnowball(type: Copy) {
+    dependsOn cleanSnowballCheckout
+
+    from fileTree(snowballStemmerDir) {
+      include '**/*'
+    }
+    into snowballPatchedDir
+
+    doLast {
+      ant.patch(patchfile: rootProject.file("gradle/generation/snowball.patch"), dir: snowballPatchedDir, strip: "1")
 
 Review comment:
   Declare inputs on the zip and move the patch step inside the download task's action (then you don't even have to copy the unpacked folder). Execute the patching in doLast; if it fails, the task won't be marked up-to-date and will be re-run next time? Just a thought.
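   
   A minimal sketch of what that might look like, reusing the names from the diff above. The `inputs.file`/`outputs.dir` declarations and `failonerror` are assumptions added for illustration (Ant's patch task does not fail the build on a rejected patch unless `failonerror` is set), and whether the task is ever reported up-to-date still depends on the download plugin's own checks:
   
   ```groovy
   task downloadSnowballStemmers(type: Download) {
     def stemmerZip = file("${snowballWorkDir}/stemmers.zip")
     def patchFile = rootProject.file("gradle/generation/snowball.patch")

     src "https://github.com/snowballstem/snowball/archive/53739a805cfa6c77ff8496dc711dc1c106d987c1.zip"
     dest stemmerZip
     onlyIfModified true
     useETag "all"
     cachedETagsFile file("${snowballWorkDir}/stemmers.json")

     // Assumption: track the patch file and the patched tree so Gradle sees them.
     inputs.file(patchFile)
     outputs.dir(snowballPatchedDir)

     doLast {
       // Start from a pristine tree, then unpack and patch in place (no copy step).
       project.delete(snowballPatchedDir)
       ant.unzip(src: stemmerZip, dest: snowballPatchedDir, overwrite: "true") {
         ant.cutdirsmapper(dirs: "1")
       }
       // failonerror makes a rejected hunk fail the task, so it is re-run next time.
       ant.patch(patchfile: patchFile, dir: snowballPatchedDir, strip: "1", failonerror: "true")
     }
   }
   ```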



[GitHub] [lucene-solr] dweiss commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0

Posted by GitBox <gi...@apache.org>.
dweiss commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0
URL: https://github.com/apache/lucene-solr/pull/1262#discussion_r380145281
 
 

 ##########
 File path: gradle/generation/snowball.gradle
 ##########
 @@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+apply plugin: "de.undercouch.download"
+
+configure(rootProject) {
+  ext {
+    snowballWorkDir    = file("${buildDir}/snowball")
+    snowballStemmerDir = file("${buildDir}/snowball/stemmers")
+    snowballPatchedDir = file("${buildDir}/snowball/patched")
+    snowballWebsiteDir = file("${buildDir}/snowball/website")
+    snowballDataDir    = file("${buildDir}/snowball/data")
+  }
+
+  task snowball()  {
+    description "Regenerate snowball-based sources, stopwords, and tests for ...lucene/analysis."
+    group "generation"
+
+    dependsOn ":lucene:analysis:common:snowballGen"
+  }
+
+  task downloadSnowballStemmers(type: Download) {
+    def stemmerZip = file("${snowballWorkDir}/stemmers.zip")
+
+    src "https://github.com/snowballstem/snowball/archive/53739a805cfa6c77ff8496dc711dc1c106d987c1.zip"
+    dest stemmerZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/stemmers.json")
+
+    doLast {
+      ant.unzip(src: stemmerZip, dest: snowballStemmerDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballWebsite(type: Download) {
+    def websiteZip = file("${snowballWorkDir}/website.zip")
+
+    src "https://github.com/snowballstem/snowball-website/archive/ff891e74f08e7315523ee3c0cad55bb1b7831b9d.zip"
+    dest websiteZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/website.json")
+
+    doLast {
+      ant.unzip(src: websiteZip, dest: snowballWebsiteDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballData(type: Download) {
+    def dataZip = file("${snowballWorkDir}/data.zip")
+
+    src "https://github.com/snowballstem/snowball-data/archive/9145f8732ec952c8a3d1066be251da198a8bc792.zip"
+    dest dataZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/data.json")
+
+    doLast {
+      ant.unzip(src: dataZip, dest: snowballDataDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task cleanSnowballCheckout(type: Delete) {
+    dependsOn downloadSnowballStemmers
+    delete snowballPatchedDir
+  }
+
+  task patchSnowball(type: Copy) {
+    dependsOn cleanSnowballCheckout
+
+    from fileTree(snowballStemmerDir) {
+      include '**/*'
+    }
+    into snowballPatchedDir
+
+    doLast {
+      ant.patch(patchfile: rootProject.file("gradle/generation/snowball.patch"), dir: snowballPatchedDir, strip: "1")
+    }
+  }
+}
+
+configure(project(":lucene:analysis:common")) {
+  task snowballGen() {
+    description "Patch and Regenerate snowball sources, stopwords, and tests"
+    group "generation"
+    dependsOn rootProject.patchSnowball
 
 Review comment:
   Commit it in, I'll correct it afterwards, no worries.



[GitHub] [lucene-solr] rmuir commented on issue #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0

Posted by GitBox <gi...@apache.org>.
rmuir commented on issue #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0
URL: https://github.com/apache/lucene-solr/pull/1262#issuecomment-587045242
 
 
   @dweiss OK, I got it working: it no longer downloads every time, and it avoids relying on HTTP caching. Instead, the downloaded files are named by commit hash. I had to add an explicit input dependency on the patch file, so that it redownloads and patches if needed.
   
   Now on the second run you see:
   {noformat}
   > Task :lucene:analysis:common:downloadSnowballData UP-TO-DATE
   > Task :lucene:analysis:common:downloadSnowballStemmers UP-TO-DATE
   > Task :lucene:analysis:common:downloadSnowballWebsite UP-TO-DATE
   {noformat}
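   
   Roughly, the idea can be sketched as follows. This is not the code from the PR: the task name `fetchPatchedSnowball`, the property name `snowballCommit`, and the use of Ant's `get` with `skipexisting` are assumptions for illustration, and the sketch assumes it lives where `snowballWorkDir` and `snowballPatchedDir` are visible:
   
   ```groovy
   task fetchPatchedSnowball {
     def commit = "53739a805cfa6c77ff8496dc711dc1c106d987c1"
     // Naming the archive by commit hash means an existing file is already the right one.
     def zip = file("${snowballWorkDir}/snowball-${commit}.zip")
     def patchFile = rootProject.file("gradle/generation/snowball.patch")

     // Editing the patch or bumping the hash invalidates the task.
     inputs.property("snowballCommit", commit)
     inputs.file(patchFile)
     outputs.dir(snowballPatchedDir)

     doLast {
       project.mkdir(snowballWorkDir)
       // skipexisting avoids any network traffic when the archive is already present.
       ant.get(src: "https://github.com/snowballstem/snowball/archive/${commit}.zip",
               dest: zip, skipexisting: "true")
       project.delete(snowballPatchedDir)
       ant.unzip(src: zip, dest: snowballPatchedDir, overwrite: "true") {
         ant.cutdirsmapper(dirs: "1")
       }
       ant.patch(patchfile: patchFile, dir: snowballPatchedDir, strip: "1", failonerror: "true")
     }
   }
   ```
   
   With inputs and outputs declared like this, a second run with nothing changed should skip the `doLast` action entirely.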



[GitHub] [lucene-solr] rmuir commented on issue #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0

Posted by GitBox <gi...@apache.org>.
rmuir commented on issue #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0
URL: https://github.com/apache/lucene-solr/pull/1262#issuecomment-586982384
 
 
   I am looking into making the regeneration work on the Mac too.
   
   Two initial problems found:
   * need to clean up the sed usage so that it works with both GNU and BSD sed
   * different test zip files are created because the random seed for sampling is not reproducible. This is because `openssl` is really LibreSSL on the Mac...
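   
   One way to sidestep both problems, sketched here as an assumption rather than what the PR does, would be to do the text substitution and the sampling in Groovy inside the build, so that neither `sed` nor `openssl` is involved. The helper names, and whatever pattern and seed a caller passes in, are hypothetical:
   
   ```groovy
   // Hypothetical replacement for a sed one-liner: behaves the same on every platform.
   def replaceInFile = { File f, String regex, String replacement ->
     f.setText(f.getText("UTF-8").replaceAll(regex, replacement), "UTF-8")
   }

   // Hypothetical sampler: java.util.Random with a fixed seed produces the same
   // sequence on every JVM, so the sampled test data should be reproducible.
   def sampleLines = { File f, int count, long seed ->
     def lines = f.readLines("UTF-8")
     Collections.shuffle(lines, new Random(seed))
     return lines.take(count)
   }
   ```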
   



[GitHub] [lucene-solr] dweiss commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0

Posted by GitBox <gi...@apache.org>.
dweiss commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0
URL: https://github.com/apache/lucene-solr/pull/1262#discussion_r380155207
 
 

 ##########
 File path: gradle/generation/snowball.gradle
 ##########
 @@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+apply plugin: "de.undercouch.download"
+
+configure(rootProject) {
+  ext {
+    snowballWorkDir    = file("${buildDir}/snowball")
+    snowballStemmerDir = file("${buildDir}/snowball/stemmers")
+    snowballPatchedDir = file("${buildDir}/snowball/patched")
+    snowballWebsiteDir = file("${buildDir}/snowball/website")
+    snowballDataDir    = file("${buildDir}/snowball/data")
+  }
+
+  task snowball()  {
+    description "Regenerate snowball-based sources, stopwords, and tests for ...lucene/analysis."
+    group "generation"
+
+    dependsOn ":lucene:analysis:common:snowballGen"
+  }
+
+  task downloadSnowballStemmers(type: Download) {
+    def stemmerZip = file("${snowballWorkDir}/stemmers.zip")
+
+    src "https://github.com/snowballstem/snowball/archive/53739a805cfa6c77ff8496dc711dc1c106d987c1.zip"
+    dest stemmerZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/stemmers.json")
+
+    doLast {
+      ant.unzip(src: stemmerZip, dest: snowballStemmerDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballWebsite(type: Download) {
+    def websiteZip = file("${snowballWorkDir}/website.zip")
+
+    src "https://github.com/snowballstem/snowball-website/archive/ff891e74f08e7315523ee3c0cad55bb1b7831b9d.zip"
+    dest websiteZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/website.json")
+
+    doLast {
+      ant.unzip(src: websiteZip, dest: snowballWebsiteDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballData(type: Download) {
+    def dataZip = file("${snowballWorkDir}/data.zip")
+
+    src "https://github.com/snowballstem/snowball-data/archive/9145f8732ec952c8a3d1066be251da198a8bc792.zip"
+    dest dataZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/data.json")
+
+    doLast {
+      ant.unzip(src: dataZip, dest: snowballDataDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task cleanSnowballCheckout(type: Delete) {
+    dependsOn downloadSnowballStemmers
+    delete snowballPatchedDir
+  }
+
+  task patchSnowball(type: Copy) {
+    dependsOn cleanSnowballCheckout
+
+    from fileTree(snowballStemmerDir) {
+      include '**/*'
+    }
+    into snowballPatchedDir
+
+    doLast {
+      ant.patch(patchfile: rootProject.file("gradle/generation/snowball.patch"), dir: snowballPatchedDir, strip: "1")
+    }
+  }
+}
+
+configure(project(":lucene:analysis:common")) {
+  task snowballGen() {
+    description "Patch and Regenerate snowball sources, stopwords, and tests"
+    group "generation"
+    dependsOn rootProject.patchSnowball
 
 Review comment:
   Thanks, Robert. FYI: the moman build could be modified the same way. There is a root-project dependency section in javacc.gradle, but the rationale there is that it is a configuration dependency that is then shared by different subprojects (and by a task that requires javacc).



[GitHub] [lucene-solr] rmuir commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0

Posted by GitBox <gi...@apache.org>.
rmuir commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0
URL: https://github.com/apache/lucene-solr/pull/1262#discussion_r380173659
 
 

 ##########
 File path: gradle/generation/snowball.gradle
 ##########
 @@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+apply plugin: "de.undercouch.download"
+
+configure(rootProject) {
+  ext {
+    snowballWorkDir    = file("${buildDir}/snowball")
+    snowballStemmerDir = file("${buildDir}/snowball/stemmers")
+    snowballPatchedDir = file("${buildDir}/snowball/patched")
+    snowballWebsiteDir = file("${buildDir}/snowball/website")
+    snowballDataDir    = file("${buildDir}/snowball/data")
+  }
+
+  task snowball()  {
+    description "Regenerate snowball-based sources, stopwords, and tests for ...lucene/analysis."
+    group "generation"
+
+    dependsOn ":lucene:analysis:common:snowballGen"
+  }
+
+  task downloadSnowballStemmers(type: Download) {
+    def stemmerZip = file("${snowballWorkDir}/stemmers.zip")
+
+    src "https://github.com/snowballstem/snowball/archive/53739a805cfa6c77ff8496dc711dc1c106d987c1.zip"
+    dest stemmerZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/stemmers.json")
+
+    doLast {
+      ant.unzip(src: stemmerZip, dest: snowballStemmerDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballWebsite(type: Download) {
+    def websiteZip = file("${snowballWorkDir}/website.zip")
+
+    src "https://github.com/snowballstem/snowball-website/archive/ff891e74f08e7315523ee3c0cad55bb1b7831b9d.zip"
+    dest websiteZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/website.json")
+
+    doLast {
+      ant.unzip(src: websiteZip, dest: snowballWebsiteDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballData(type: Download) {
+    def dataZip = file("${snowballWorkDir}/data.zip")
+
+    src "https://github.com/snowballstem/snowball-data/archive/9145f8732ec952c8a3d1066be251da198a8bc792.zip"
+    dest dataZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/data.json")
+
+    doLast {
+      ant.unzip(src: dataZip, dest: snowballDataDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task cleanSnowballCheckout(type: Delete) {
+    dependsOn downloadSnowballStemmers
+    delete snowballPatchedDir
+  }
+
+  task patchSnowball(type: Copy) {
+    dependsOn cleanSnowballCheckout
+
+    from fileTree(snowballStemmerDir) {
+      include '**/*'
+    }
+    into snowballPatchedDir
+
+    doLast {
+      ant.patch(patchfile: rootProject.file("gradle/generation/snowball.patch"), dir: snowballPatchedDir, strip: "1")
 
 Review comment:
   @dweiss I can assure you it does not download again. Please look at the PR; maybe the names are confusing and we can clean them up? I think the logic is correct.
   
   Here is what happens (first run):
   * Download `github.com/snowball/xyz.zip` to `stemmers.zip`, extract it to `stemmers/`, and save the ETag to `stemmers.json`.
   * Create a clean copy of `stemmers/` named `patched/`.
   * Apply the patch to `patched/`.
   
   Second run:
   * Try to download, but send the request with `If-None-Match: a4gk4k4...`. The remote server returns an HTTP 304 (Not Modified). This would not be the case if, e.g., someone bumped the snowball commit hash locally.
   * Create a clean copy of `stemmers/` named `patched/`.
   * Apply the patch to `patched/`.
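   
   The second-run behaviour is a plain HTTP conditional GET. As an illustration only (the download plugin handles this internally; the helper name and its return convention are made up here), a Groovy sketch:
   
   ```groovy
   // Hypothetical helper: returns true if the remote file changed and was re-downloaded.
   boolean fetchIfChanged(String url, File dest, String cachedETag) {
     def conn = (HttpURLConnection) new URL(url).openConnection()
     if (cachedETag != null) {
       // Ask the server to send the body only if it no longer matches our copy.
       conn.setRequestProperty("If-None-Match", cachedETag)
     }
     try {
       if (conn.responseCode == HttpURLConnection.HTTP_NOT_MODIFIED) {
         return false   // 304: keep the file we already have
       }
       dest.withOutputStream { out -> out << conn.inputStream }
       return true
     } finally {
       conn.disconnect()
     }
   }
   ```
   
   The sketch does not persist the new ETag; a caller would read it with `conn.getHeaderField("ETag")` and store it for the next run.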



[GitHub] [lucene-solr] rmuir commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0

Posted by GitBox <gi...@apache.org>.
rmuir commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0
URL: https://github.com/apache/lucene-solr/pull/1262#discussion_r380158861
 
 

 ##########
 File path: gradle/generation/snowball.gradle
 ##########
 @@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+apply plugin: "de.undercouch.download"
+
+configure(rootProject) {
+  ext {
+    snowballWorkDir    = file("${buildDir}/snowball")
+    snowballStemmerDir = file("${buildDir}/snowball/stemmers")
+    snowballPatchedDir = file("${buildDir}/snowball/patched")
+    snowballWebsiteDir = file("${buildDir}/snowball/website")
+    snowballDataDir    = file("${buildDir}/snowball/data")
+  }
+
+  task snowball()  {
+    description "Regenerate snowball-based sources, stopwords, and tests for ...lucene/analysis."
+    group "generation"
+
+    dependsOn ":lucene:analysis:common:snowballGen"
+  }
+
+  task downloadSnowballStemmers(type: Download) {
+    def stemmerZip = file("${snowballWorkDir}/stemmers.zip")
+
+    src "https://github.com/snowballstem/snowball/archive/53739a805cfa6c77ff8496dc711dc1c106d987c1.zip"
+    dest stemmerZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/stemmers.json")
+
+    doLast {
+      ant.unzip(src: stemmerZip, dest: snowballStemmerDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballWebsite(type: Download) {
+    def websiteZip = file("${snowballWorkDir}/website.zip")
+
+    src "https://github.com/snowballstem/snowball-website/archive/ff891e74f08e7315523ee3c0cad55bb1b7831b9d.zip"
+    dest websiteZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/website.json")
+
+    doLast {
+      ant.unzip(src: websiteZip, dest: snowballWebsiteDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballData(type: Download) {
+    def dataZip = file("${snowballWorkDir}/data.zip")
+
+    src "https://github.com/snowballstem/snowball-data/archive/9145f8732ec952c8a3d1066be251da198a8bc792.zip"
+    dest dataZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/data.json")
+
+    doLast {
+      ant.unzip(src: dataZip, dest: snowballDataDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task cleanSnowballCheckout(type: Delete) {
 
 Review comment:
   It worked fine as a doFirst. Here is the issue: it is nice for `gradlew snowball` to be fast and not download 30MB files (!) every time you run it when you are iterating locally, for example to actually upgrade it.
   
   So while it may be infrequent, I wanted to make it more pleasant to run this over and over until you get it right, because you will be dealing with plenty of other unpleasant stuff...



[GitHub] [lucene-solr] rmuir commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0

Posted by GitBox <gi...@apache.org>.
rmuir commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0
URL: https://github.com/apache/lucene-solr/pull/1262#discussion_r380145098
 
 

 ##########
 File path: gradle/generation/snowball.gradle
 ##########
 @@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+apply plugin: "de.undercouch.download"
+
+configure(rootProject) {
+  ext {
+    snowballWorkDir    = file("${buildDir}/snowball")
+    snowballStemmerDir = file("${buildDir}/snowball/stemmers")
+    snowballPatchedDir = file("${buildDir}/snowball/patched")
+    snowballWebsiteDir = file("${buildDir}/snowball/website")
+    snowballDataDir    = file("${buildDir}/snowball/data")
+  }
+
+  task snowball()  {
+    description "Regenerate snowball-based sources, stopwords, and tests for ...lucene/analysis."
+    group "generation"
+
+    dependsOn ":lucene:analysis:common:snowballGen"
+  }
+
+  task downloadSnowballStemmers(type: Download) {
+    def stemmerZip = file("${snowballWorkDir}/stemmers.zip")
+
+    src "https://github.com/snowballstem/snowball/archive/53739a805cfa6c77ff8496dc711dc1c106d987c1.zip"
+    dest stemmerZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/stemmers.json")
+
+    doLast {
+      ant.unzip(src: stemmerZip, dest: snowballStemmerDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballWebsite(type: Download) {
+    def websiteZip = file("${snowballWorkDir}/website.zip")
+
+    src "https://github.com/snowballstem/snowball-website/archive/ff891e74f08e7315523ee3c0cad55bb1b7831b9d.zip"
+    dest websiteZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/website.json")
+
+    doLast {
+      ant.unzip(src: websiteZip, dest: snowballWebsiteDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballData(type: Download) {
+    def dataZip = file("${snowballWorkDir}/data.zip")
+
+    src "https://github.com/snowballstem/snowball-data/archive/9145f8732ec952c8a3d1066be251da198a8bc792.zip"
+    dest dataZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/data.json")
+
+    doLast {
+      ant.unzip(src: dataZip, dest: snowballDataDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task cleanSnowballCheckout(type: Delete) {
+    dependsOn downloadSnowballStemmers
+    delete snowballPatchedDir
+  }
+
+  task patchSnowball(type: Copy) {
+    dependsOn cleanSnowballCheckout
+
+    from fileTree(snowballStemmerDir) {
+      include '**/*'
+    }
+    into snowballPatchedDir
+
+    doLast {
+      ant.patch(patchfile: rootProject.file("gradle/generation/snowball.patch"), dir: snowballPatchedDir, strip: "1")
+    }
+  }
+}
+
+configure(project(":lucene:analysis:common")) {
+  task snowballGen() {
+    description "Patch and Regenerate snowball sources, stopwords, and tests"
+    group "generation"
+    dependsOn rootProject.patchSnowball
 
 Review comment:
   I can look into it. I am pretty clueless about Gradle; I based this on the `moman` regeneration in `gradle/generation/util.gradle`, because it does similar things (downloading a zip and running commands).



[GitHub] [lucene-solr] dweiss commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0

Posted by GitBox <gi...@apache.org>.
dweiss commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0
URL: https://github.com/apache/lucene-solr/pull/1262#discussion_r380175386
 
 

 ##########
 File path: gradle/generation/snowball.gradle
 ##########
 @@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+apply plugin: "de.undercouch.download"
+
+configure(rootProject) {
+  ext {
+    snowballWorkDir    = file("${buildDir}/snowball")
+    snowballStemmerDir = file("${buildDir}/snowball/stemmers")
+    snowballPatchedDir = file("${buildDir}/snowball/patched")
+    snowballWebsiteDir = file("${buildDir}/snowball/website")
+    snowballDataDir    = file("${buildDir}/snowball/data")
+  }
+
+  task snowball()  {
+    description "Regenerate snowball-based sources, stopwords, and tests for ...lucene/analysis."
+    group "generation"
+
+    dependsOn ":lucene:analysis:common:snowballGen"
+  }
+
+  task downloadSnowballStemmers(type: Download) {
+    def stemmerZip = file("${snowballWorkDir}/stemmers.zip")
+
+    src "https://github.com/snowballstem/snowball/archive/53739a805cfa6c77ff8496dc711dc1c106d987c1.zip"
+    dest stemmerZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/stemmers.json")
+
+    doLast {
+      ant.unzip(src: stemmerZip, dest: snowballStemmerDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballWebsite(type: Download) {
+    def websiteZip = file("${snowballWorkDir}/website.zip")
+
+    src "https://github.com/snowballstem/snowball-website/archive/ff891e74f08e7315523ee3c0cad55bb1b7831b9d.zip"
+    dest websiteZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/website.json")
+
+    doLast {
+      ant.unzip(src: websiteZip, dest: snowballWebsiteDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballData(type: Download) {
+    def dataZip = file("${snowballWorkDir}/data.zip")
+
+    src "https://github.com/snowballstem/snowball-data/archive/9145f8732ec952c8a3d1066be251da198a8bc792.zip"
+    dest dataZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/data.json")
+
+    doLast {
+      ant.unzip(src: dataZip, dest: snowballDataDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task cleanSnowballCheckout(type: Delete) {
+    dependsOn downloadSnowballStemmers
+    delete snowballPatchedDir
+  }
+
+  task patchSnowball(type: Copy) {
+    dependsOn cleanSnowballCheckout
+
+    from fileTree(snowballStemmerDir) {
+      include '**/*'
+    }
+    into snowballPatchedDir
+
+    doLast {
+      ant.patch(patchfile: rootProject.file("gradle/generation/snowball.patch"), dir: snowballPatchedDir, strip: "1")
 
 Review comment:
   I know it doesn't download twice, but if the task declared inputs/outputs or upToDateWhen correctly, it shouldn't run any actions attached to it (doFirst/doLast). Right now it runs them regardless of whether it downloaded the artifact, right? (That was my impression from other sections of the code, not this patch.)
   
   Anyway, if it's working then fine. As for different versions of patch: throw git/jgit's implementation into the mix... they all seem to behave differently in corner cases.
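   
   A minimal sketch of the inputs/outputs idea, assuming the unzip step is moved out of the download task's doLast into its own Copy task. The task name `unpackSnowballStemmers` is hypothetical; `snowballWorkDir` and `snowballStemmerDir` are the properties from the quoted `ext` block. Whether this fits the rest of the build is untested.

   ```groovy
   // Hypothetical task; would replace the ant.unzip call in doLast.
   task unpackSnowballStemmers(type: Copy) {
     dependsOn downloadSnowballStemmers

     // A Copy task registers `from` as inputs and `into` as outputs, so
     // Gradle can skip it (UP-TO-DATE) when neither has changed.
     from zipTree(file("${snowballWorkDir}/stemmers.zip"))
     into snowballStemmerDir

     // Drop the archive's top-level directory, similar in effect to
     // ant.cutdirsmapper(dirs: "1").
     eachFile { details ->
       def segments = details.relativePath.segments
       if (segments.length > 1) {
         details.relativePath = new RelativePath(true, segments.drop(1) as String[])
       } else {
         details.exclude()
       }
     }
     includeEmptyDirs = false
   }
   ```

   With this split, the download task only fetches the zip, and the unpacking is a separate step that Gradle can decide to skip on its own.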



[GitHub] [lucene-solr] rmuir commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0

Posted by GitBox <gi...@apache.org>.
rmuir commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0
URL: https://github.com/apache/lucene-solr/pull/1262#discussion_r380143193
 
 

 ##########
 File path: gradle/generation/snowball.gradle
 ##########
 @@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+apply plugin: "de.undercouch.download"
+
+configure(rootProject) {
+  ext {
+    snowballWorkDir    = file("${buildDir}/snowball")
+    snowballStemmerDir = file("${buildDir}/snowball/stemmers")
+    snowballPatchedDir = file("${buildDir}/snowball/patched")
+    snowballWebsiteDir = file("${buildDir}/snowball/website")
+    snowballDataDir    = file("${buildDir}/snowball/data")
+  }
+
+  task snowball()  {
+    description "Regenerate snowball-based sources, stopwords, and tests for ...lucene/analysis."
+    group "generation"
+
+    dependsOn ":lucene:analysis:common:snowballGen"
+  }
+
+  task downloadSnowballStemmers(type: Download) {
+    def stemmerZip = file("${snowballWorkDir}/stemmers.zip")
+
+    src "https://github.com/snowballstem/snowball/archive/53739a805cfa6c77ff8496dc711dc1c106d987c1.zip"
+    dest stemmerZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/stemmers.json")
+
+    doLast {
+      ant.unzip(src: stemmerZip, dest: snowballStemmerDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballWebsite(type: Download) {
+    def websiteZip = file("${snowballWorkDir}/website.zip")
+
+    src "https://github.com/snowballstem/snowball-website/archive/ff891e74f08e7315523ee3c0cad55bb1b7831b9d.zip"
+    dest websiteZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/website.json")
+
+    doLast {
+      ant.unzip(src: websiteZip, dest: snowballWebsiteDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballData(type: Download) {
+    def dataZip = file("${snowballWorkDir}/data.zip")
+
+    src "https://github.com/snowballstem/snowball-data/archive/9145f8732ec952c8a3d1066be251da198a8bc792.zip"
+    dest dataZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/data.json")
+
+    doLast {
+      ant.unzip(src: dataZip, dest: snowballDataDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task cleanSnowballCheckout(type: Delete) {
 
 Review comment:
   Yeah, mainly I'm after two things:
   * Don't download stuff you have already downloaded (uses an ETag check). Some files are quite large.
   * Before patching, make a clean copy of the checkout. This way we never try to apply the patch twice and never force another download.
   
   We can change to a doFirst, I think. Lemme try.
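   
   A rough sketch of the doFirst variant, assuming the separate `cleanSnowballCheckout` task is dropped and the delete moves into `patchSnowball` itself. Everything else is taken from the quoted build file; this has not been run.

   ```groovy
   task patchSnowball(type: Copy) {
     dependsOn downloadSnowballStemmers

     from snowballStemmerDir
     into snowballPatchedDir

     // Start from an empty directory so the patch is always applied to a
     // fresh copy. This runs only when the task itself executes.
     doFirst {
       project.delete snowballPatchedDir
     }

     doLast {
       ant.patch(patchfile: rootProject.file("gradle/generation/snowball.patch"),
                 dir: snowballPatchedDir, strip: "1")
     }
   }
   ```

   The downloaded checkout in `snowballStemmerDir` is never modified, so re-running the patch step does not require downloading again.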



[GitHub] [lucene-solr] dweiss commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0

Posted by GitBox <gi...@apache.org>.
dweiss commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0
URL: https://github.com/apache/lucene-solr/pull/1262#discussion_r380146063
 
 

 ##########
 File path: gradle/generation/snowball.gradle
 ##########
 @@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+apply plugin: "de.undercouch.download"
+
+configure(rootProject) {
+  ext {
+    snowballWorkDir    = file("${buildDir}/snowball")
+    snowballStemmerDir = file("${buildDir}/snowball/stemmers")
+    snowballPatchedDir = file("${buildDir}/snowball/patched")
+    snowballWebsiteDir = file("${buildDir}/snowball/website")
+    snowballDataDir    = file("${buildDir}/snowball/data")
+  }
+
+  task snowball()  {
+    description "Regenerate snowball-based sources, stopwords, and tests for ...lucene/analysis."
+    group "generation"
+
+    dependsOn ":lucene:analysis:common:snowballGen"
+  }
+
+  task downloadSnowballStemmers(type: Download) {
+    def stemmerZip = file("${snowballWorkDir}/stemmers.zip")
+
+    src "https://github.com/snowballstem/snowball/archive/53739a805cfa6c77ff8496dc711dc1c106d987c1.zip"
+    dest stemmerZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/stemmers.json")
+
+    doLast {
+      ant.unzip(src: stemmerZip, dest: snowballStemmerDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballWebsite(type: Download) {
+    def websiteZip = file("${snowballWorkDir}/website.zip")
+
+    src "https://github.com/snowballstem/snowball-website/archive/ff891e74f08e7315523ee3c0cad55bb1b7831b9d.zip"
+    dest websiteZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/website.json")
+
+    doLast {
+      ant.unzip(src: websiteZip, dest: snowballWebsiteDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballData(type: Download) {
+    def dataZip = file("${snowballWorkDir}/data.zip")
+
+    src "https://github.com/snowballstem/snowball-data/archive/9145f8732ec952c8a3d1066be251da198a8bc792.zip"
+    dest dataZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/data.json")
+
+    doLast {
+      ant.unzip(src: dataZip, dest: snowballDataDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task cleanSnowballCheckout(type: Delete) {
+    dependsOn downloadSnowballStemmers
+    delete snowballPatchedDir
+  }
+
+  task patchSnowball(type: Copy) {
+    dependsOn cleanSnowballCheckout
+
+    from fileTree(snowballStemmerDir) {
+      include '**/*'
+    }
+    into snowballPatchedDir
+
+    doLast {
+      ant.patch(patchfile: rootProject.file("gradle/generation/snowball.patch"), dir: snowballPatchedDir, strip: "1")
 
 Review comment:
   Not many people do (I don't). You'd need to run under cygwin or have msys tools installed (which are recompiled for Windows). I don't think this is a problem: I don't think there is a single Windows developer who isn't fluent in other environments. If a regeneration task does not work on Windows, so be it (I'd make it complain gracefully, but this can come later).
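   
   One way the "complain gracefully" part might look, as a sketch only. The helper `patchOnPath` is hypothetical and simply scans the PATH directories for a `patch` executable; the error message wording is an assumption as well.

   ```groovy
   // Hypothetical helper: true if `patch` (or `patch.exe`) is found in one
   // of the directories listed in the PATH environment variable.
   def patchOnPath = {
     def path = System.getenv("PATH") ?: ""
     path.split(File.pathSeparator).any { dir ->
       ["patch", "patch.exe"].any { name -> new File(dir, name).canExecute() }
     }
   }

   patchSnowball {
     doFirst {
       if (!patchOnPath()) {
         throw new GradleException(
             "Cannot regenerate snowball sources: no 'patch' executable found on PATH " +
             "(on Windows, try running under cygwin or with msys tools installed).")
       }
     }
   }
   ```

   The check fails the task early with a readable message instead of an error from deep inside ant.patch.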



[GitHub] [lucene-solr] dweiss commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0

Posted by GitBox <gi...@apache.org>.
dweiss commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0
URL: https://github.com/apache/lucene-solr/pull/1262#discussion_r380138104
 
 

 ##########
 File path: gradle/generation/snowball.gradle
 ##########
 @@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+apply plugin: "de.undercouch.download"
+
+configure(rootProject) {
+  ext {
+    snowballWorkDir    = file("${buildDir}/snowball")
+    snowballStemmerDir = file("${buildDir}/snowball/stemmers")
+    snowballPatchedDir = file("${buildDir}/snowball/patched")
+    snowballWebsiteDir = file("${buildDir}/snowball/website")
+    snowballDataDir    = file("${buildDir}/snowball/data")
+  }
+
+  task snowball()  {
+    description "Regenerate snowball-based sources, stopwords, and tests for ...lucene/analysis."
+    group "generation"
+
+    dependsOn ":lucene:analysis:common:snowballGen"
+  }
+
+  task downloadSnowballStemmers(type: Download) {
+    def stemmerZip = file("${snowballWorkDir}/stemmers.zip")
+
+    src "https://github.com/snowballstem/snowball/archive/53739a805cfa6c77ff8496dc711dc1c106d987c1.zip"
+    dest stemmerZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/stemmers.json")
+
+    doLast {
+      ant.unzip(src: stemmerZip, dest: snowballStemmerDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballWebsite(type: Download) {
+    def websiteZip = file("${snowballWorkDir}/website.zip")
+
+    src "https://github.com/snowballstem/snowball-website/archive/ff891e74f08e7315523ee3c0cad55bb1b7831b9d.zip"
+    dest websiteZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/website.json")
+
+    doLast {
+      ant.unzip(src: websiteZip, dest: snowballWebsiteDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballData(type: Download) {
+    def dataZip = file("${snowballWorkDir}/data.zip")
+
+    src "https://github.com/snowballstem/snowball-data/archive/9145f8732ec952c8a3d1066be251da198a8bc792.zip"
+    dest dataZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/data.json")
+
+    doLast {
+      ant.unzip(src: dataZip, dest: snowballDataDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task cleanSnowballCheckout(type: Delete) {
 
 Review comment:
   The "cleanXXX" pattern is a rule in gradle... I don't know whether it will clash with something. Perhaps rename it to "recompileSnowballCheckout" or something like that?



[GitHub] [lucene-solr] dweiss commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0

Posted by GitBox <gi...@apache.org>.
dweiss commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0
URL: https://github.com/apache/lucene-solr/pull/1262#discussion_r380189395
 
 

 ##########
 File path: gradle/generation/snowball.gradle
 ##########
 @@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+apply plugin: "de.undercouch.download"
+
+configure(rootProject) {
+  ext {
+    snowballWorkDir    = file("${buildDir}/snowball")
+    snowballStemmerDir = file("${buildDir}/snowball/stemmers")
+    snowballPatchedDir = file("${buildDir}/snowball/patched")
+    snowballWebsiteDir = file("${buildDir}/snowball/website")
+    snowballDataDir    = file("${buildDir}/snowball/data")
+  }
+
+  task snowball()  {
+    description "Regenerate snowball-based sources, stopwords, and tests for ...lucene/analysis."
+    group "generation"
+
+    dependsOn ":lucene:analysis:common:snowballGen"
+  }
+
+  task downloadSnowballStemmers(type: Download) {
+    def stemmerZip = file("${snowballWorkDir}/stemmers.zip")
+
+    src "https://github.com/snowballstem/snowball/archive/53739a805cfa6c77ff8496dc711dc1c106d987c1.zip"
+    dest stemmerZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/stemmers.json")
+
+    doLast {
+      ant.unzip(src: stemmerZip, dest: snowballStemmerDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballWebsite(type: Download) {
+    def websiteZip = file("${snowballWorkDir}/website.zip")
+
+    src "https://github.com/snowballstem/snowball-website/archive/ff891e74f08e7315523ee3c0cad55bb1b7831b9d.zip"
+    dest websiteZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/website.json")
+
+    doLast {
+      ant.unzip(src: websiteZip, dest: snowballWebsiteDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballData(type: Download) {
+    def dataZip = file("${snowballWorkDir}/data.zip")
+
+    src "https://github.com/snowballstem/snowball-data/archive/9145f8732ec952c8a3d1066be251da198a8bc792.zip"
+    dest dataZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/data.json")
+
+    doLast {
+      ant.unzip(src: dataZip, dest: snowballDataDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task cleanSnowballCheckout(type: Delete) {
+    dependsOn downloadSnowballStemmers
+    delete snowballPatchedDir
+  }
+
+  task patchSnowball(type: Copy) {
+    dependsOn cleanSnowballCheckout
+
+    from fileTree(snowballStemmerDir) {
+      include '**/*'
+    }
+    into snowballPatchedDir
+
+    doLast {
+      ant.patch(patchfile: rootProject.file("gradle/generation/snowball.patch"), dir: snowballPatchedDir, strip: "1")
 
 Review comment:
   It wasn't my intention to change the patch; these were my random musings about gradle, the download task, etc. It's definitely fine to commit as is and can be improved later (or never, if we can convince the snowball people to generate better code).



[GitHub] [lucene-solr] rmuir commented on issue #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0

Posted by GitBox <gi...@apache.org>.
rmuir commented on issue #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0
URL: https://github.com/apache/lucene-solr/pull/1262#issuecomment-586714328
 
 
   I've got gradle logic working to regenerate snowball. It's currently slow because the script is slow (vim-reformat takes forever). I will see if it's enough to fix indentation with simple whitespace replacement. @dweiss will certainly hate it :), but the whole procedure was never automated before; we had to start somewhere.



[GitHub] [lucene-solr] rmuir commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0

Posted by GitBox <gi...@apache.org>.
rmuir commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0
URL: https://github.com/apache/lucene-solr/pull/1262#discussion_r380178308
 
 

 ##########
 File path: gradle/generation/snowball.gradle
 ##########
 @@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+apply plugin: "de.undercouch.download"
+
+configure(rootProject) {
+  ext {
+    snowballWorkDir    = file("${buildDir}/snowball")
+    snowballStemmerDir = file("${buildDir}/snowball/stemmers")
+    snowballPatchedDir = file("${buildDir}/snowball/patched")
+    snowballWebsiteDir = file("${buildDir}/snowball/website")
+    snowballDataDir    = file("${buildDir}/snowball/data")
+  }
+
+  task snowball()  {
+    description "Regenerate snowball-based sources, stopwords, and tests for ...lucene/analysis."
+    group "generation"
+
+    dependsOn ":lucene:analysis:common:snowballGen"
+  }
+
+  task downloadSnowballStemmers(type: Download) {
+    def stemmerZip = file("${snowballWorkDir}/stemmers.zip")
+
+    src "https://github.com/snowballstem/snowball/archive/53739a805cfa6c77ff8496dc711dc1c106d987c1.zip"
+    dest stemmerZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/stemmers.json")
+
+    doLast {
+      ant.unzip(src: stemmerZip, dest: snowballStemmerDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballWebsite(type: Download) {
+    def websiteZip = file("${snowballWorkDir}/website.zip")
+
+    src "https://github.com/snowballstem/snowball-website/archive/ff891e74f08e7315523ee3c0cad55bb1b7831b9d.zip"
+    dest websiteZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/website.json")
+
+    doLast {
+      ant.unzip(src: websiteZip, dest: snowballWebsiteDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballData(type: Download) {
+    def dataZip = file("${snowballWorkDir}/data.zip")
+
+    src "https://github.com/snowballstem/snowball-data/archive/9145f8732ec952c8a3d1066be251da198a8bc792.zip"
+    dest dataZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/data.json")
+
+    doLast {
+      ant.unzip(src: dataZip, dest: snowballDataDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task cleanSnowballCheckout(type: Delete) {
+    dependsOn downloadSnowballStemmers
+    delete snowballPatchedDir
+  }
+
+  task patchSnowball(type: Copy) {
+    dependsOn cleanSnowballCheckout
+
+    from fileTree(snowballStemmerDir) {
+      include '**/*'
+    }
+    into snowballPatchedDir
+
+    doLast {
+      ant.patch(patchfile: rootProject.file("gradle/generation/snowball.patch"), dir: snowballPatchedDir, strip: "1")
 
 Review comment:
   @dweiss ok, I will look at inputs/outputs/upToDateWhen. Maybe a simpler way is for these tools to download to a file with the commit's hash in its name and declare such stuff? We'd then be able to fix moman so it doesn't download over and over too (cache headers cannot work over there).
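   
   A sketch of the hash-in-the-filename idea, assuming the download plugin's `overwrite false` option is used in place of the ETag cache. The variable `snowballCommit` is a hypothetical name; the hash is the one already in the `src` URL of the quoted diff.

   ```groovy
   // Hypothetical variable; pins the snowball commit in one place.
   def snowballCommit = "53739a805cfa6c77ff8496dc711dc1c106d987c1"

   task downloadSnowballStemmers(type: Download) {
     def stemmerZip = file("${snowballWorkDir}/snowball-${snowballCommit}.zip")

     src "https://github.com/snowballstem/snowball/archive/${snowballCommit}.zip"
     dest stemmerZip
     // The archive for a fixed commit should not change, so an existing
     // file is reused without contacting the server.
     overwrite false
   }
   ```

   Bumping the commit changes the file name, which forces a fresh download without relying on cache headers from the server.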



[GitHub] [lucene-solr] dweiss commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0

Posted by GitBox <gi...@apache.org>.
dweiss commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0
URL: https://github.com/apache/lucene-solr/pull/1262#discussion_r380138559
 
 

 ##########
 File path: gradle/generation/snowball.gradle
 ##########
 @@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+apply plugin: "de.undercouch.download"
+
+configure(rootProject) {
+  ext {
+    snowballWorkDir    = file("${buildDir}/snowball")
+    snowballStemmerDir = file("${buildDir}/snowball/stemmers")
+    snowballPatchedDir = file("${buildDir}/snowball/patched")
+    snowballWebsiteDir = file("${buildDir}/snowball/website")
+    snowballDataDir    = file("${buildDir}/snowball/data")
+  }
+
+  task snowball()  {
+    description "Regenerate snowball-based sources, stopwords, and tests for ...lucene/analysis."
+    group "generation"
+
+    dependsOn ":lucene:analysis:common:snowballGen"
+  }
+
+  task downloadSnowballStemmers(type: Download) {
+    def stemmerZip = file("${snowballWorkDir}/stemmers.zip")
+
+    src "https://github.com/snowballstem/snowball/archive/53739a805cfa6c77ff8496dc711dc1c106d987c1.zip"
+    dest stemmerZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/stemmers.json")
+
+    doLast {
+      ant.unzip(src: stemmerZip, dest: snowballStemmerDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballWebsite(type: Download) {
+    def websiteZip = file("${snowballWorkDir}/website.zip")
+
+    src "https://github.com/snowballstem/snowball-website/archive/ff891e74f08e7315523ee3c0cad55bb1b7831b9d.zip"
+    dest websiteZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/website.json")
+
+    doLast {
+      ant.unzip(src: websiteZip, dest: snowballWebsiteDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballData(type: Download) {
+    def dataZip = file("${snowballWorkDir}/data.zip")
+
+    src "https://github.com/snowballstem/snowball-data/archive/9145f8732ec952c8a3d1066be251da198a8bc792.zip"
+    dest dataZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/data.json")
+
+    doLast {
+      ant.unzip(src: dataZip, dest: snowballDataDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task cleanSnowballCheckout(type: Delete) {
 
 Review comment:
   Oh. Could also be a doFirst { project.delete snowballPatchedDir } on the downloadSnowballStemmers?



[GitHub] [lucene-solr] rmuir commented on issue #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0

Posted by GitBox <gi...@apache.org>.
rmuir commented on issue #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0
URL: https://github.com/apache/lucene-solr/pull/1262#issuecomment-587021516
 
 
   OK, `gradlew snowball` now works on my mac and linux and regenerates all files without any changes.



[GitHub] [lucene-solr] dweiss commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0

Posted by GitBox <gi...@apache.org>.
dweiss commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0
URL: https://github.com/apache/lucene-solr/pull/1262#discussion_r380146604
 
 

 ##########
 File path: gradle/generation/snowball.gradle
 ##########
 @@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+apply plugin: "de.undercouch.download"
+
+configure(rootProject) {
+  ext {
+    snowballWorkDir    = file("${buildDir}/snowball")
+    snowballStemmerDir = file("${buildDir}/snowball/stemmers")
+    snowballPatchedDir = file("${buildDir}/snowball/patched")
+    snowballWebsiteDir = file("${buildDir}/snowball/website")
+    snowballDataDir    = file("${buildDir}/snowball/data")
+  }
+
+  task snowball()  {
+    description "Regenerate snowball-based sources, stopwords, and tests for ...lucene/analysis."
+    group "generation"
+
+    dependsOn ":lucene:analysis:common:snowballGen"
+  }
+
+  task downloadSnowballStemmers(type: Download) {
+    def stemmerZip = file("${snowballWorkDir}/stemmers.zip")
+
+    src "https://github.com/snowballstem/snowball/archive/53739a805cfa6c77ff8496dc711dc1c106d987c1.zip"
+    dest stemmerZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/stemmers.json")
+
+    doLast {
+      ant.unzip(src: stemmerZip, dest: snowballStemmerDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballWebsite(type: Download) {
+    def websiteZip = file("${snowballWorkDir}/website.zip")
+
+    src "https://github.com/snowballstem/snowball-website/archive/ff891e74f08e7315523ee3c0cad55bb1b7831b9d.zip"
+    dest websiteZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/website.json")
+
+    doLast {
+      ant.unzip(src: websiteZip, dest: snowballWebsiteDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task downloadSnowballData(type: Download) {
+    def dataZip = file("${snowballWorkDir}/data.zip")
+
+    src "https://github.com/snowballstem/snowball-data/archive/9145f8732ec952c8a3d1066be251da198a8bc792.zip"
+    dest dataZip
+    onlyIfModified true
+    useETag "all"
+    cachedETagsFile file("${snowballWorkDir}/data.json")
+
+    doLast {
+      ant.unzip(src: dataZip, dest: snowballDataDir, overwrite: "true") {
+        ant.cutdirsmapper(dirs: "1")
+      }
+    }
+  }
+
+  task cleanSnowballCheckout(type: Delete) {
+    dependsOn downloadSnowballStemmers
+    delete snowballPatchedDir
+  }
+
+  task patchSnowball(type: Copy) {
+    dependsOn cleanSnowballCheckout
+
+    from fileTree(snowballStemmerDir) {
+      include '**/*'
+    }
+    into snowballPatchedDir
+
+    doLast {
+      ant.patch(patchfile: rootProject.file("gradle/generation/snowball.patch"), dir: snowballPatchedDir, strip: "1")
 
 Review comment:
   I looked for a platform-independent implementation of the patch command but didn't find one (jgit's is inherently broken; @kojisekig wrote something for Jenkins, but I haven't looked into it; maybe he'll have some more insight).
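
   For context, Ant's `patch` task shells out to a native `patch` executable, which is why platform independence is a concern here. The following is a hedged sketch, not part of this PR: it only adds the Ant task's `failonerror` attribute (false by default) so the build stops when the native command exits with a failure code instead of carrying on with unpatched sources. It does not remove the dependency on a native `patch`.

   ```groovy
   // Sketch only: same task as in the diff, with failonerror added.
   task patchSnowball(type: Copy) {
     dependsOn cleanSnowballCheckout

     from fileTree(snowballStemmerDir) {
       include '**/*'
     }
     into snowballPatchedDir

     doLast {
       // Fail the build if the native patch command reports an error.
       ant.patch(patchfile: rootProject.file("gradle/generation/snowball.patch"),
                 dir: snowballPatchedDir,
                 strip: "1",
                 failonerror: "true")
     }
   }
   ```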



[GitHub] [lucene-solr] rmuir commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0

Posted by GitBox <gi...@apache.org>.
rmuir commented on a change in pull request #1262: LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0
URL: https://github.com/apache/lucene-solr/pull/1262#discussion_r380144024
 
 

 ##########
 File path: gradle/generation/snowball.gradle
 ##########
 @@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+apply plugin: "de.undercouch.download"
+
+configure(rootProject) {
+  ext {
+    snowballWorkDir    = file("${buildDir}/snowball")
+    snowballStemmerDir = file("${buildDir}/snowball/stemmers")
 
 Review comment:
   Yes, duh. I'll fix!
