Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/11/07 20:37:04 UTC

[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #4128: Combined TPCH runs & uniformed summaries for benchmarks

alamb commented on code in PR #4128:
URL: https://github.com/apache/arrow-datafusion/pull/4128#discussion_r1015873324


##########
benchmarks/src/bin/tpch.rs:
##########
@@ -64,7 +64,7 @@ static ALLOC: mimalloc::MiMalloc = mimalloc::MiMalloc;
 struct DataFusionBenchmarkOpt {
     /// Query number

Review Comment:
   The docstrings end up in the output of `--help`, so I think it would be nice to mention what happens if this option is not specified.
   
   ```suggestion
       /// Query number. If not specified runs all queries
   ```



##########
benchmarks/README.md:
##########
@@ -49,6 +49,11 @@ The benchmark can then be run (assuming the data created from `dbgen` is in `./d
 cargo run --release --bin tpch -- benchmark datafusion --iterations 3 --path ./data --format tbl --query 1 --batch-size 4096
 ```
 
+If you omit `--query=<query_id>` argument, then all benchmarks will be run one by one (from query 1 to query 22).
+```bash
+cargo run --release --bin tpch -- benchmark datafusion --iterations 1 --path ./data --format tbl --query 1 --batch-size 4096

Review Comment:
   Should this example perhaps not have `--query 1`?
   
   ```suggestion
   cargo run --release --bin tpch -- benchmark datafusion --iterations 1 --path ./data --format tbl --batch-size 4096
   ```



##########
benchmarks/src/bin/tpch.rs:
##########
@@ -182,29 +182,57 @@ async fn main() -> Result<()> {
     }
 }
 
-async fn benchmark_datafusion(opt: DataFusionBenchmarkOpt) -> Result<Vec<RecordBatch>> {
+const TPCH_QUERY_START_ID: usize = 1;
+const TPCH_QUERY_END_ID: usize = 22;
+
+async fn benchmark_datafusion(
+    opt: DataFusionBenchmarkOpt,
+) -> Result<Vec<Vec<RecordBatch>>> {
     println!("Running benchmarks with the following options: {:?}", opt);
-    let mut benchmark_run = BenchmarkRun::new(opt.query);
+    let query_range = match opt.query {

Review Comment:
   👍 
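
   For readers of the truncated hunk above, here is a minimal sketch of how an optional query number could map to a range of query ids. It assumes `opt.query` is an `Option<usize>`; the helper name `query_range` is hypothetical and this is not the PR's actual code. Only the two constants are taken from the diff.

   ```rust
   use std::ops::RangeInclusive;

   // Constants as they appear in the diff above.
   const TPCH_QUERY_START_ID: usize = 1;
   const TPCH_QUERY_END_ID: usize = 22;

   /// Hypothetical helper (not from the PR): turn the optional `--query`
   /// value into the inclusive range of query ids to run.
   fn query_range(query: Option<usize>) -> RangeInclusive<usize> {
       match query {
           // A single query was requested: run only that one.
           Some(id) => id..=id,
           // No query given: run every TPC-H query in order.
           None => TPCH_QUERY_START_ID..=TPCH_QUERY_END_ID,
       }
   }
   ```

   With this shape, the benchmark loop can iterate over `query_range(opt.query)` and collect one result set per query, which would fit the new `Result<Vec<Vec<RecordBatch>>>` return type.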


