You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2022/03/23 07:17:29 UTC

[GitHub] [lucene] dweiss commented on a change in pull request #759: LUCENE-9651 Update benchmark module docs

dweiss commented on a change in pull request #759:
URL: https://github.com/apache/lucene/pull/759#discussion_r832917684



##########
File path: gradle/datasets/external-datasets.gradle
##########
@@ -120,10 +120,9 @@ configure(project(":lucene:benchmark")) {
   task getReuters(type: Download) {
     ext {
       name = "reuters21578"
-      // note: there is no HTTPS url and we don't care because this is merely test/perf data
-      src = "http://www.daviddlewis.com/resources/testcollections/reuters21578/${name}.tar.gz"
+      src = "https://kdd.ics.uci.edu/databases/${name}/${name}.tar.gz"
       intermediate = file("${dataDir}/${name}.tar.gz")
-      dst = file("${dataDir}/${name}")

Review comment:
       I'd keep the reutrers21578 name here (not reuters-out).

##########
File path: lucene/benchmark/conf/micro-standard.alg
##########
@@ -30,8 +30,7 @@ doc.tokenized=true
 doc.term.vector=false
 log.step=500
 
-work.dir=data
-docs.dir=reuters21578
+docs.dir=reuters-out

Review comment:
       Same here.

##########
File path: lucene/benchmark/conf/readContentSource.alg
##########
@@ -24,7 +24,7 @@
 #
 # To use this, first cd to benchmark and then run:
 #
-#   ant run-task -Dtask.alg=conf/readContentSource.alg
+#   gradlew run -P task.alg=conf/readContentSource.alg

Review comment:
       I don't think it'll work if you run gradlew from the top-level of the project and give a module-relative path to -Ptask.alg (haven't checked). If task.alg is resolved relative to the module then maybe it should be resolved relative to conf (not to give a false impression that it's a "path" relative to cwd)?
   
   The second thing is that the run task may be present in other modules besides benchmarks; I would give an explicit task path here: gradlew :lucene:benchmarks:run -Ptask.alg=... so that you can't accidentally start Luke or something else, accidently.
   

##########
File path: lucene/benchmark/conf/tokenize.alg
##########
@@ -22,7 +22,7 @@
 #
 # To use this, cd to benchmark and then run:
 #
-#   ant run-task -Dtask.alg=conf/tokenize.alg
+#   gradlew run -P task.alg=conf/tokenize.alg

Review comment:
       This white space between -P and the property is somehow disturbing. Even if it works, it's really not following the convention syntax of how properties are passed to gradle [1].
   https://docs.gradle.org/current/userguide/command_line_interface.html#sec:environment_options




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org