Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2021/06/10 08:36:22 UTC

[GitHub] [lucene] dweiss opened a new pull request #178: LUCENE-9977: rat task corrections (proper up-to-date checks, cleanup and rewrite of the task itself).

dweiss opened a new pull request #178:
URL: https://github.com/apache/lucene/pull/178


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] uschindler commented on a change in pull request #178: LUCENE-9977: rat task corrections (proper up-to-date checks, cleanup and rewrite of the task itself).

Posted by GitBox <gi...@apache.org>.
uschindler commented on a change in pull request #178:
URL: https://github.com/apache/lucene/pull/178#discussion_r649145401



##########
File path: gradle/validation/rat-sources.gradle
##########
@@ -74,6 +74,9 @@ allprojects {
                     exclude ".idea"
                     exclude ".muse"
 
+                    // Exclude github stuff (templates, workflows).
+                    exclude ".github"

Review comment:
       Should we not also exclude the ".git" folder itself, or is this done by default?






[GitHub] [lucene] dweiss commented on a change in pull request #178: LUCENE-9977: rat task corrections (proper up-to-date checks, cleanup and rewrite of the task itself).

Posted by GitBox <gi...@apache.org>.
dweiss commented on a change in pull request #178:
URL: https://github.com/apache/lucene/pull/178#discussion_r648973848



##########
File path: gradle/validation/rat-sources.gradle
##########
@@ -27,139 +28,122 @@ configure(rootProject) {
     }
 }
 
+// Configure the rat validation task and all scanned directories.
 allprojects {
     task("rat", type: RatTask) {
         group = 'Verification'
         description = 'Runs Apache Rat checks.'
-    }
-}
-
-configure(rootProject) {
-    rat {
-        includes += [
-            "buildSrc/**/*.java",
-            "gradle/**/*.gradle",
-            "lucene/tools/forbiddenApis/**",
-            "lucene/tools/prettify/**",
-        ]
-        excludes += [
-            // Unclear if this needs ASF header, depends on how much was copied from ElasticSearch
-            "**/ErrorReportingTestListener.java"
-        ]
-    }
-}
-
-configure(project(":lucene:analysis:common")) {
-    rat {
-        srcExcludes += [
-            "**/*.aff",
-            "**/*.dic",
-            "**/*.wrong",
-            "**/*.good",
-            "**/*.sug",
-            "**/charfilter/*.htm*",
-            "**/*LuceneResourcesWikiPage.html"
-        ]
-    }
-}
-
-configure(project(":lucene:analysis:kuromoji")) {
-    rat {
-        srcExcludes += [
-            // whether rat detects this as binary or not is platform dependent?!
-            "**/bocchan.utf-8"
-        ]
-    }
-}
 
-configure(project(":lucene:analysis:opennlp")) {
-    rat {
-        excludes += [
-            "src/tools/test-model-data/*.txt",
-        ]
-    }
-}
-
-configure(project(":lucene:highlighter")) {
-    rat {
-        srcExcludes += [
-            "**/CambridgeMA.utf8"
-        ]
-    }
-}
-
-configure(project(":lucene:suggest")) {
-    rat {
-        srcExcludes += [
-            "**/Top50KWiki.utf8",
-            "**/stop-snowball.txt"
-        ]
+        def defaultScanFileTree = project.fileTree(projectDir, {
+            // Don't check under the project's build folder.
+            exclude project.buildDir.name
+
+            // Exclude any generated stuff.
+            exclude "src/generated"
+
+            // Don't check any of the subprojects - they have their own rat tasks.
+            exclude subprojects.collect { it.projectDir.name }
+
+            // At the module scope we only check selected file patterns as folks have various .gitignore-d resources
+            // generated by IDEs, etc.
+            include "**/*.gradle"
+            include "**/*.xml"
+            include "**/*.md"
+            include "**/*.py"
+            include "**/*.sh"
+            include "**/*.bat"
+
+            // Include selected patterns from any source folders. We could make this
+            // relative to source sets but it seems to be of little value - all our source sets
+            // live under 'src' anyway.
+            include "src/**"
+            exclude "src/**/*.png"
+            exclude "src/**/*.txt"
+            exclude "src/**/*.zip"
+            exclude "src/**/*.properties"
+            exclude "src/**/*.utf8"
+
+            // Conditionally apply module-specific patterns. We do it here instead
+            // of reconfiguring each project because the provider can be made lazy
+            // and it's easier to manage this way.
+            switch (project.path) {
+                case ":":
+                    include "gradlew"
+                    include "gradlew.bat"
+                    exclude ".gradle"
+                    exclude ".idea"
+                    exclude ".muse"
+
+                    // The root project also includes patterns for the bootstrap (buildSrc) and composite
+                    // projects. Include their sources in the scan.
+                    include "buildSrc/src/**"
+                    include "dev-tools/missing-doclet/src/**"
+                    break
+
+                case ":lucene:analysis:morfologik":
+                    exclude "src/**/*.info"
+                    exclude "src/**/*.input"
+                    break
+
+                case ":lucene:analysis:opennlp":
+                    exclude "src/**/en-test-lemmas.dict"
+                    break
+
+                case ":lucene:test-framework":
+                    exclude "src/**/europarl.lines.txt.seek"
+                    break
+
+                case ":lucene:analysis:common":
+                    exclude "src/**/*.aff"
+                    exclude "src/**/*.dic"
+                    exclude "src/**/*.good"
+                    exclude "src/**/*.sug"
+                    exclude "src/**/*.wrong"
+                    exclude "src/**/charfilter/*.htm*"
+                    exclude "src/**/*LuceneResourcesWikiPage.html"
+                    exclude "src/**/*.rslp"
+                    break
+
+                case ":lucene:benchmark":
+                    exclude "data/"
+                    break
+            }
+        })
+        inputFileTrees.add(defaultScanFileTree)
     }
 }
 
-// Structure inspired by existing task from Apache Kafka, heavily modified since then.
+/**
+ * An Apache RAT adapter that validates whether files contain acceptable licenses.
+ */
 class RatTask extends DefaultTask {
-    @Input
-    List<String> includes = [
-        "*.gradle",
-        "*.xml",
-        "src/tools/**"
-    ]
-
-    @Input
-    List<String> excludes = []
-
-    @Input
-    List<String> srcExcludes = [
-        "**/TODO",
-        "**/*.txt",
-        "**/*.md",
-        "**/*.iml",
-        "build/**"
-    ]
+    @InputFiles
+    ListProperty<ConfigurableFileTree> inputFileTrees = project.objects.listProperty(ConfigurableFileTree)

Review comment:
       This is intentionally left as a list of file trees. We only use a single file tree for now, but a list may be useful in the future if we need multiple file trees as input.
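
To illustrate why a list might be handy, here is a minimal sketch of how a build script could contribute a second file tree to the task. It assumes the `rat` task and its `inputFileTrees` property as shown in the diff above; the `extra-resources` directory is a hypothetical name used only for illustration.

```groovy
// Hypothetical addition in some project's build script.
// "extra-resources" is an assumed directory, not part of the pull request.
tasks.named("rat") {
    // inputFileTrees is the ListProperty<ConfigurableFileTree> declared on RatTask.
    inputFileTrees.add(project.fileTree("extra-resources") {
        include "**/*.xml"
    })
}
```

Since the property is annotated with `@InputFiles`, files from any added tree should also take part in the task's up-to-date check.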

##########
File path: gradle/validation/rat-sources.gradle
##########
 
     @OutputFile
-    def xmlReport = new File(new File(project.buildDir, 'rat'), 'rat-report.xml')
+    RegularFileProperty xmlReport = project.objects.fileProperty().convention(
+        project.layout.buildDirectory.file("rat/rat-report.xml"))
 
-    def generateXmlReport() {
+    def generateReport(File reportFile) {
+        // Set up ant rat task.
         def uri = 'antlib:org.apache.rat.anttasks'
         def ratClasspath = project.rootProject.configurations.ratDeps.asPath
         ant.taskdef(resource: 'org/apache/rat/anttasks/antlib.xml', uri: uri, classpath: ratClasspath)
-
         def rat = NamespaceBuilder.newInstance(ant, uri)
-        rat.report(format: 'xml', reportFile: xmlReport, addDefaultLicenseMatchers: true) {
-            ant.fileset(dir: "${project.projectDir}") {
-                includes.each { pattern -> ant.include(name: pattern) }
-                excludes.each { pattern -> ant.exclude(name: pattern) }
-            }
 
-            if (project.plugins.findPlugin(JavaPlugin)) {
-                def checkSets = [
-                    project.sourceSets.main.java.srcDirs,
-                    project.sourceSets.test.java.srcDirs,
-                ]
-
-                project.sourceSets.matching { it.name == 'tools' }.all {
-                    checkSets += project.sourceSets.tools.java.srcDirs
-                }
-
-                checkSets.flatten().each { srcLocation ->
-                    ant.fileset(dir: srcLocation, erroronmissingdir: false) {
-                        srcExcludes.each { pattern -> ant.exclude(name: pattern) }
-                    }
-                }
-
-                [
-                    project.sourceSets.main.resources.srcDirs
-                ].flatten().each { srcLocation ->
-                    ant.fileset(dir: srcLocation, erroronmissingdir: false) {
-                        ant.include(name: "META-INF/**")
-                    }
-                }
+        // Collect all input files for debugging.
+        String inputFileList = inputFileTrees.get().collectMany { fileTree ->
+            fileTree.asList()
+        }.sort().join("\n")
+        project.file(reportFile.path.replaceAll('\\.xml$', '-filelist.txt')).setText(inputFileList, "UTF-8")

Review comment:
       This writes the list of processed files to a sibling of the rat report file, which makes it easy to see what was actually included in the check.
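
The naming of the sibling file can be sketched in plain Groovy, outside of Gradle. The report path and the two sample files below are assumed values for illustration. The pattern here escapes the dot and anchors at the end of the string, so only a literal `.xml` suffix is replaced.

```groovy
// Plain Groovy sketch; the report path is an assumed example value.
def reportFile = new File("build/rat/rat-report.xml")

// Escape the dot and anchor at the end so only a literal ".xml" suffix is replaced.
def fileListPath = reportFile.path.replaceAll('\\.xml$', '-filelist.txt')

// Sample entries standing in for the files collected from the input file trees.
def inputFiles = [new File("src/b.java"), new File("src/a.java")]
def listing = inputFiles.collect { it.path }.sort().join("\n")

println fileListPath
println listing
```

On a Unix-like system the derived path should be `build/rat/rat-report-filelist.txt`, next to the XML report.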






[GitHub] [lucene] dweiss commented on a change in pull request #178: LUCENE-9977: rat task corrections (proper up-to-date checks, cleanup and rewrite of the task itself).

Posted by GitBox <gi...@apache.org>.
dweiss commented on a change in pull request #178:
URL: https://github.com/apache/lucene/pull/178#discussion_r649163037



##########
File path: gradle/validation/rat-sources.gradle
##########
@@ -74,6 +74,9 @@ allprojects {
                     exclude ".idea"
                     exclude ".muse"
 
+                    // Exclude github stuff (templates, workflows).
+                    exclude ".github"

Review comment:
       You could. None of the default inclusion patterns match anything in there, I guess. I found the "debug" file list produced by the rat task very useful for figuring out which files were checked. I'll add .git to the excluded folders.
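
A minimal sketch of the resulting excludes, assuming a root-project file tree with an abbreviated pattern set (the real task uses more patterns). As far as I know, Gradle file trees also apply Ant's default excludes, which are documented to cover `.git`, so the explicit exclude is probably redundant but harmless; treat that as an assumption to verify.

```groovy
// Sketch of a file tree for the root project; the pattern set is abbreviated.
def scanTree = project.fileTree(projectDir) {
    include "**/*.md"
    include "**/*.xml"

    // Version control metadata and GitHub templates/workflows are not scanned.
    exclude ".git"
    exclude ".github"
}

// List what would actually be checked.
scanTree.files.sort().each { println it }
```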






[GitHub] [lucene] dweiss commented on a change in pull request #178: LUCENE-9977: rat task corrections (proper up-to-date checks, cleanup and rewrite of the task itself).

Posted by GitBox <gi...@apache.org>.
dweiss commented on a change in pull request #178:
URL: https://github.com/apache/lucene/pull/178#discussion_r649092605



##########
File path: .github/PULL_REQUEST_TEMPLATE.md
##########
@@ -1,3 +1,20 @@
+<!--
+    Licensed to the Apache Software Foundation (ASF) under one or more
+    contributor license agreements.  See the NOTICE file distributed with
+    this work for additional information regarding copyright ownership.
+    The ASF licenses this file to You under the Apache License, Version 2.0
+    the "License"); you may not use this file except in compliance with
+    the License.  You may obtain a copy of the License at
+
+        http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+ -->
+

Review comment:
       Sorry, I was careless when updating the offenders. I think we can exclude it; will do so in a sec.






[GitHub] [lucene] dweiss merged pull request #178: LUCENE-9977: rat task corrections (proper up-to-date checks, cleanup and rewrite of the task itself).

Posted by GitBox <gi...@apache.org>.
dweiss merged pull request #178:
URL: https://github.com/apache/lucene/pull/178


   




[GitHub] [lucene] dweiss commented on a change in pull request #178: LUCENE-9977: rat task corrections (proper up-to-date checks, cleanup and rewrite of the task itself).

Posted by GitBox <gi...@apache.org>.
dweiss commented on a change in pull request #178:
URL: https://github.com/apache/lucene/pull/178#discussion_r649162120



##########
File path: gradle/validation/rat-sources.gradle
##########
@@ -27,139 +28,125 @@ configure(rootProject) {
     }
 }
 
+// Configure the rat validation task and all scanned directories.
 allprojects {
     task("rat", type: RatTask) {
         group = 'Verification'
         description = 'Runs Apache Rat checks.'
-    }
-}
-
-configure(rootProject) {
-    rat {
-        includes += [
-            "buildSrc/**/*.java",
-            "gradle/**/*.gradle",
-            "lucene/tools/forbiddenApis/**",
-            "lucene/tools/prettify/**",
-        ]
-        excludes += [
-            // Unclear if this needs ASF header, depends on how much was copied from ElasticSearch
-            "**/ErrorReportingTestListener.java"
-        ]
-    }
-}
-
-configure(project(":lucene:analysis:common")) {
-    rat {
-        srcExcludes += [
-            "**/*.aff",
-            "**/*.dic",
-            "**/*.wrong",
-            "**/*.good",
-            "**/*.sug",
-            "**/charfilter/*.htm*",
-            "**/*LuceneResourcesWikiPage.html"
-        ]
-    }
-}
-
-configure(project(":lucene:analysis:kuromoji")) {
-    rat {
-        srcExcludes += [
-            // whether rat detects this as binary or not is platform dependent?!
-            "**/bocchan.utf-8"
-        ]
-    }
-}
 
-configure(project(":lucene:analysis:opennlp")) {
-    rat {
-        excludes += [
-            "src/tools/test-model-data/*.txt",
-        ]
-    }
-}
-
-configure(project(":lucene:highlighter")) {
-    rat {
-        srcExcludes += [
-            "**/CambridgeMA.utf8"
-        ]
-    }
-}
-
-configure(project(":lucene:suggest")) {
-    rat {
-        srcExcludes += [
-            "**/Top50KWiki.utf8",
-            "**/stop-snowball.txt"
-        ]
+        def defaultScanFileTree = project.fileTree(projectDir, {
+            // Don't check under the project's build folder.
+            exclude project.buildDir.name
+
+            // Exclude any generated stuff.
+            exclude "src/generated"
+
+            // Don't check any of the subprojects - they have their own rat tasks.
+            exclude subprojects.collect { it.projectDir.name }
+
+            // At the module scope we only check selected file patterns as folks have various .gitignore-d resources
+            // generated by IDEs, etc.
+            include "**/*.gradle"
+            include "**/*.xml"
+            include "**/*.md"
+            include "**/*.py"
+            include "**/*.sh"
+            include "**/*.bat"
+
+            // Include selected patterns from any source folders. We could make this
+            // relative to source sets but it seems to be of little value - all our source sets
+            // live under 'src' anyway.
+            include "src/**"
+            exclude "src/**/*.png"
+            exclude "src/**/*.txt"
+            exclude "src/**/*.zip"
+            exclude "src/**/*.properties"
+            exclude "src/**/*.utf8"
+
+            // Conditionally apply module-specific patterns. We do it here instead
+            // of reconfiguring each project because the provider can be made lazy
+            // and it's easier to manage this way.
+            switch (project.path) {
+                case ":":
+                    include "gradlew"
+                    include "gradlew.bat"
+                    exclude ".gradle"
+                    exclude ".idea"
+                    exclude ".muse"
+
+                    // Exclude github stuff (templates, workflows).
+                    exclude ".github"
+
+                    // The root project also includes patterns for the bootstrap (buildSrc) and composite
+                    // projects. Include their sources in the scan.
+                    include "buildSrc/src/**"
+                    include "dev-tools/missing-doclet/src/**"
+                    break
+
+                case ":lucene:analysis:morfologik":
+                    exclude "src/**/*.info"
+                    exclude "src/**/*.input"
+                    break
+
+                case ":lucene:analysis:opennlp":
+                    exclude "src/**/en-test-lemmas.dict"
+                    break
+
+                case ":lucene:test-framework":
+                    exclude "src/**/europarl.lines.txt.seek"
+                    break
+
+                case ":lucene:analysis:common":
+                    exclude "src/**/*.aff"
+                    exclude "src/**/*.dic"
+                    exclude "src/**/*.good"
+                    exclude "src/**/*.sug"
+                    exclude "src/**/*.wrong"
+                    exclude "src/**/charfilter/*.htm*"
+                    exclude "src/**/*LuceneResourcesWikiPage.html"
+                    exclude "src/**/*.rslp"
+                    break
+
+                case ":lucene:benchmark":
+                    exclude "data/"
+                    break
+            }
+        })
+        inputFileTrees.add(defaultScanFileTree)
     }
 }
 
-// Structure inspired by existing task from Apache Kafka, heavily modified since then.
+/**
+ * An Apache RAT adapter that validates whether files contain acceptable licenses.
+ */
 class RatTask extends DefaultTask {
-    @Input
-    List<String> includes = [
-        "*.gradle",
-        "*.xml",
-        "src/tools/**"
-    ]
-
-    @Input
-    List<String> excludes = []
-
-    @Input
-    List<String> srcExcludes = [
-        "**/TODO",
-        "**/*.txt",
-        "**/*.md",
-        "**/*.iml",
-        "build/**"
-    ]
+    @InputFiles
+    ListProperty<ConfigurableFileTree> inputFileTrees = project.objects.listProperty(ConfigurableFileTree)
 
     @OutputFile
-    def xmlReport = new File(new File(project.buildDir, 'rat'), 'rat-report.xml')
+    RegularFileProperty xmlReport = project.objects.fileProperty().convention(
+        project.layout.buildDirectory.file("rat/rat-report.xml"))
 
-    def generateXmlReport() {
+    def generateReport(File reportFile) {
+        // Set up ant rat task.
         def uri = 'antlib:org.apache.rat.anttasks'
         def ratClasspath = project.rootProject.configurations.ratDeps.asPath
         ant.taskdef(resource: 'org/apache/rat/anttasks/antlib.xml', uri: uri, classpath: ratClasspath)
-
         def rat = NamespaceBuilder.newInstance(ant, uri)
-        rat.report(format: 'xml', reportFile: xmlReport, addDefaultLicenseMatchers: true) {
-            ant.fileset(dir: "${project.projectDir}") {
-                includes.each { pattern -> ant.include(name: pattern) }
-                excludes.each { pattern -> ant.exclude(name: pattern) }
-            }
 
-            if (project.plugins.findPlugin(JavaPlugin)) {
-                def checkSets = [
-                    project.sourceSets.main.java.srcDirs,
-                    project.sourceSets.test.java.srcDirs,
-                ]
-
-                project.sourceSets.matching { it.name == 'tools' }.all {
-                    checkSets += project.sourceSets.tools.java.srcDirs
-                }
-
-                checkSets.flatten().each { srcLocation ->
-                    ant.fileset(dir: srcLocation, erroronmissingdir: false) {
-                        srcExcludes.each { pattern -> ant.exclude(name: pattern) }
-                    }
-                }
-
-                [
-                    project.sourceSets.main.resources.srcDirs
-                ].flatten().each { srcLocation ->
-                    ant.fileset(dir: srcLocation, erroronmissingdir: false) {
-                        ant.include(name: "META-INF/**")
-                    }
-                }
+        // Collect all input files for debugging.
+        String inputFileList = inputFileTrees.get().collectMany { fileTree ->
+            fileTree.asList()
+        }.sort().join("\n")
+        project.file(reportFile.path.replaceAll('\\.xml$', '-filelist.txt')).setText(inputFileList, "UTF-8")
+
+        // Run rat via ant.
+        rat.report(format: 'xml', reportFile: reportFile, addDefaultLicenseMatchers: true) {
+            // Pass all gradle file trees to the ant task (Gradle's internal adapters are used).
+            inputFileTrees.get().each { fileTree ->
+                fileTree.addToAntBuilder(ant, 'resources', FileCollection.AntType.ResourceCollection)

Review comment:
       I did and it worked. I thought about a single top-level task too, actually. You could make an alias in every subproject that just depends on the root-level check, but that would mean re-running the check over all the files. It's working so nicely and quickly now that I didn't bother (you could try to make it incremental, but that is additional work I don't have time for).
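
For reference, the alias idea could look roughly like the sketch below. This is a hypothetical alternative and not what the pull request does: it assumes a single `rat` task defined only on the root project, with no per-project `rat` tasks.

```groovy
// Hypothetical alternative, not what the pull request does: every subproject
// gets a "rat" alias that only depends on a single root-level task.
subprojects {
    tasks.register("rat") {
        group = 'Verification'
        description = 'Alias for the root-level Apache Rat check.'
        // ":rat" is the task path of the (assumed) root-level rat task.
        dependsOn ":rat"
    }
}
```

The trade-off is the one described above: running the alias from any subproject would re-check all files unless the root-level task were made incremental.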






[GitHub] [lucene] uschindler commented on a change in pull request #178: LUCENE-9977: rat task corrections (proper up-to-date checks, cleanup and rewrite of the task itself).

Posted by GitBox <gi...@apache.org>.
uschindler commented on a change in pull request #178:
URL: https://github.com/apache/lucene/pull/178#discussion_r649140811



##########
File path: gradle/validation/rat-sources.gradle
##########
@@ -27,139 +28,125 @@ configure(rootProject) {
     }
 }
 
+// Configure the rat validation task and all scanned directories.
 allprojects {
     task("rat", type: RatTask) {
         group = 'Verification'
         description = 'Runs Apache Rat checks.'
-    }
-}
-
-configure(rootProject) {
-    rat {
-        includes += [
-            "buildSrc/**/*.java",
-            "gradle/**/*.gradle",
-            "lucene/tools/forbiddenApis/**",
-            "lucene/tools/prettify/**",
-        ]
-        excludes += [
-            // Unclear if this needs ASF header, depends on how much was copied from ElasticSearch
-            "**/ErrorReportingTestListener.java"
-        ]
-    }
-}
-
-configure(project(":lucene:analysis:common")) {
-    rat {
-        srcExcludes += [
-            "**/*.aff",
-            "**/*.dic",
-            "**/*.wrong",
-            "**/*.good",
-            "**/*.sug",
-            "**/charfilter/*.htm*",
-            "**/*LuceneResourcesWikiPage.html"
-        ]
-    }
-}
-
-configure(project(":lucene:analysis:kuromoji")) {
-    rat {
-        srcExcludes += [
-            // whether rat detects this as binary or not is platform dependent?!
-            "**/bocchan.utf-8"
-        ]
-    }
-}
 
-configure(project(":lucene:analysis:opennlp")) {
-    rat {
-        excludes += [
-            "src/tools/test-model-data/*.txt",
-        ]
-    }
-}
-
-configure(project(":lucene:highlighter")) {
-    rat {
-        srcExcludes += [
-            "**/CambridgeMA.utf8"
-        ]
-    }
-}
-
-configure(project(":lucene:suggest")) {
-    rat {
-        srcExcludes += [
-            "**/Top50KWiki.utf8",
-            "**/stop-snowball.txt"
-        ]
+        def defaultScanFileTree = project.fileTree(projectDir, {
+            // Don't check under the project's build folder.
+            exclude project.buildDir.name
+
+            // Exclude any generated stuff.
+            exclude "src/generated"
+
+            // Don't check any of the subprojects - they have their own rat tasks.
+            exclude subprojects.collect { it.projectDir.name }
+
+            // At the module scope we only check selected file patterns as folks have various .gitignore-d resources
+            // generated by IDEs, etc.
+            include "**/*.gradle"
+            include "**/*.xml"
+            include "**/*.md"
+            include "**/*.py"
+            include "**/*.sh"
+            include "**/*.bat"
+
+            // Include selected patterns from any source folders. We could make this
+            // relative to source sets but it seems to be of little value - all our source sets
+            // live under 'src' anyway.
+            include "src/**"
+            exclude "src/**/*.png"
+            exclude "src/**/*.txt"
+            exclude "src/**/*.zip"
+            exclude "src/**/*.properties"
+            exclude "src/**/*.utf8"
+
+            // Conditionally apply module-specific patterns. We do it here instead
+            // of reconfiguring each project because the provider can be made lazy
+            // and it's easier to manage this way.
+            switch (project.path) {
+                case ":":
+                    include "gradlew"
+                    include "gradlew.bat"
+                    exclude ".gradle"
+                    exclude ".idea"
+                    exclude ".muse"
+
+                    // Exclude github stuff (templates, workflows).
+                    exclude ".github"
+
+                    // The root project also includes patterns for the bootstrap (buildSrc) and composite
+                    // projects. Include their sources in the scan.
+                    include "buildSrc/src/**"
+                    include "dev-tools/missing-doclet/src/**"
+                    break
+
+                case ":lucene:analysis:morfologik":
+                    exclude "src/**/*.info"
+                    exclude "src/**/*.input"
+                    break
+
+                case ":lucene:analysis:opennlp":
+                    exclude "src/**/en-test-lemmas.dict"
+                    break
+
+                case ":lucene:test-framework":
+                    exclude "src/**/europarl.lines.txt.seek"
+                    break
+
+                case ":lucene:analysis:common":
+                    exclude "src/**/*.aff"
+                    exclude "src/**/*.dic"
+                    exclude "src/**/*.good"
+                    exclude "src/**/*.sug"
+                    exclude "src/**/*.wrong"
+                    exclude "src/**/charfilter/*.htm*"
+                    exclude "src/**/*LuceneResourcesWikiPage.html"
+                    exclude "src/**/*.rslp"
+                    break
+
+                case ":lucene:benchmark":
+                    exclude "data/"
+                    break
+            }
+        })
+        inputFileTrees.add(defaultScanFileTree)
     }
 }
 
-// Structure inspired by existing task from Apache Kafka, heavily modified since then.
+/**
+ * An Apache RAT adapter that validates whether files contain acceptable licenses.
+ */
 class RatTask extends DefaultTask {
-    @Input
-    List<String> includes = [
-        "*.gradle",
-        "*.xml",
-        "src/tools/**"
-    ]
-
-    @Input
-    List<String> excludes = []
-
-    @Input
-    List<String> srcExcludes = [
-        "**/TODO",
-        "**/*.txt",
-        "**/*.md",
-        "**/*.iml",
-        "build/**"
-    ]
+    @InputFiles
+    ListProperty<ConfigurableFileTree> inputFileTrees = project.objects.listProperty(ConfigurableFileTree)
 
     @OutputFile
-    def xmlReport = new File(new File(project.buildDir, 'rat'), 'rat-report.xml')
+    RegularFileProperty xmlReport = project.objects.fileProperty().convention(
+        project.layout.buildDirectory.file("rat/rat-report.xml"))
 
-    def generateXmlReport() {
+    def generateReport(File reportFile) {
+        // Set up ant rat task.
         def uri = 'antlib:org.apache.rat.anttasks'
         def ratClasspath = project.rootProject.configurations.ratDeps.asPath
         ant.taskdef(resource: 'org/apache/rat/anttasks/antlib.xml', uri: uri, classpath: ratClasspath)
-
         def rat = NamespaceBuilder.newInstance(ant, uri)
-        rat.report(format: 'xml', reportFile: xmlReport, addDefaultLicenseMatchers: true) {
-            ant.fileset(dir: "${project.projectDir}") {
-                includes.each { pattern -> ant.include(name: pattern) }
-                excludes.each { pattern -> ant.exclude(name: pattern) }
-            }
 
-            if (project.plugins.findPlugin(JavaPlugin)) {
-                def checkSets = [
-                    project.sourceSets.main.java.srcDirs,
-                    project.sourceSets.test.java.srcDirs,
-                ]
-
-                project.sourceSets.matching { it.name == 'tools' }.all {
-                    checkSets += project.sourceSets.tools.java.srcDirs
-                }
-
-                checkSets.flatten().each { srcLocation ->
-                    ant.fileset(dir: srcLocation, erroronmissingdir: false) {
-                        srcExcludes.each { pattern -> ant.exclude(name: pattern) }
-                    }
-                }
-
-                [
-                    project.sourceSets.main.resources.srcDirs
-                ].flatten().each { srcLocation ->
-                    ant.fileset(dir: srcLocation, erroronmissingdir: false) {
-                        ant.include(name: "META-INF/**")
-                    }
-                }
+        // Collect all input files for debugging.
+        String inputFileList = inputFileTrees.get().collectMany { fileTree ->
+            fileTree.asList()
+        }.sort().join("\n")
+        project.file(reportFile.path.replaceAll('.xml$', '-filelist.txt')).setText(inputFileList, "UTF-8")
+
+        // Run rat via ant.
+        rat.report(format: 'xml', reportFile: reportFile, addDefaultLicenseMatchers: true) {
+            // Pass all gradle file trees to the ant task (Gradle's internal adapters are used).
+            inputFileTrees.get().each { fileTree ->
+                fileTree.addToAntBuilder(ant, 'resources', FileCollection.AntType.ResourceCollection)

Review comment:
       Perfect!
   Did you test this by removing the license header from one of the Java source files?
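
   The check asked about here amounts to taking a file that currently passes, removing its license header, and confirming that the `rat` task then flags it. As a rough illustration of that before/after expectation only, here is a minimal Java sketch. It is a simplified stand-in that looks for one fixed phrase; Apache Rat's actual license matchers work differently, and the class, method, and constant names are hypothetical.

```java
final class HeaderCheckSketch {
    // Assumption of this sketch: a file "has a header" if it contains this phrase.
    // Apache Rat's real matching logic is more involved.
    static final String MARKER = "Licensed to the Apache Software Foundation (ASF)";

    // Returns true if the content contains the ASF marker phrase.
    static boolean hasAsfHeader(String content) {
        return content.contains(MARKER);
    }

    // Removes the first block comment, but only if it contains the marker phrase.
    static String stripHeader(String content) {
        int start = content.indexOf("/*");
        if (start < 0) {
            return content;
        }
        int end = content.indexOf("*/", start + 2);
        if (end < 0) {
            return content;
        }
        String comment = content.substring(start, end + 2);
        if (!comment.contains(MARKER)) {
            return content;
        }
        return content.substring(0, start) + content.substring(end + 2);
    }

    public static void main(String[] args) {
        String withHeader =
            "/*\n"
            + " * Licensed to the Apache Software Foundation (ASF) under one or more\n"
            + " * contributor license agreements.\n"
            + " */\n"
            + "package demo;\n\n"
            + "class Demo {}\n";

        // Before: the header is present, so the check should pass.
        System.out.println(hasAsfHeader(withHeader));
        // After: the header is removed, so the check should fail.
        System.out.println(hasAsfHeader(stripHeader(withHeader)));
    }
}
```

   In the real build the equivalent experiment would be to delete the header from one source file, run the `rat` task, and expect it to fail for that file.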






[GitHub] [lucene] rmuir commented on a change in pull request #178: LUCENE-9977: rat task corrections (proper up-to-date checks, cleanup and rewrite of the task itself).

Posted by GitBox <gi...@apache.org>.
rmuir commented on a change in pull request #178:
URL: https://github.com/apache/lucene/pull/178#discussion_r649072881



##########
File path: .github/PULL_REQUEST_TEMPLATE.md
##########
@@ -1,3 +1,20 @@
+<!--
+    Licensed to the Apache Software Foundation (ASF) under one or more
+    contributor license agreements.  See the NOTICE file distributed with
+    this work for additional information regarding copyright ownership.
+    The ASF licenses this file to You under the Apache License, Version 2.0
+    (the "License"); you may not use this file except in compliance with
+    the License.  You may obtain a copy of the License at
+
+        http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+ -->
+

Review comment:
       Do we want this license header going into every PR description by default? Added like this, the whole license text will be included in each new PR description (similar to how the template today includes ```<!-- _(If you are a project committer then you may remove some/all of the following template.)_```).
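
       One way to sidestep this, in line with the `exclude ".github"` pattern that appears in the rat-sources.gradle changes of this PR, is to leave the template without a header and keep the directory out of the scan. A minimal Java sketch of that kind of top-level directory exclusion follows. The class and method names are hypothetical, and this is plain string handling, not Gradle's pattern matcher.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

final class ScanFilterSketch {
    // Hypothetical helper: drops every '/'-separated relative path whose
    // first segment is one of the excluded top-level directories.
    static List<String> filterScanned(List<String> relativePaths,
                                      Set<String> excludedTopLevelDirs) {
        return relativePaths.stream()
            .filter(p -> !excludedTopLevelDirs.contains(p.split("/", 2)[0]))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> candidates = List.of(
            ".github/PULL_REQUEST_TEMPLATE.md",
            "gradle/validation/rat-sources.gradle",
            "README.md");

        // The excluded names mirror the root-project excludes quoted in this thread.
        Set<String> excluded = Set.of(".github", ".gradle", ".idea", ".muse");

        // The PR template is filtered out, so it would not need a license header.
        System.out.println(filterScanned(candidates, excluded));
    }
}
```

       With the directory excluded, the template can stay header-free and nothing extra ends up in new PR descriptions.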



