Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/02/06 17:36:41 UTC

[GitHub] [arrow-datafusion] andygrove opened a new pull request #1766: Benchmark json summary

andygrove opened a new pull request #1766:
URL: https://github.com/apache/arrow-datafusion/pull/1766


   # Which issue does this PR close?
   
   Closes https://github.com/apache/arrow-datafusion/issues/1757.
   
   # Rationale for this change
   To help with benchmark automation and reporting, I would like the benchmark results to be written to a JSON file.
   
   # What changes are included in this PR?
   This PR adds a new `--output` argument (short form `-o`) to the tpch benchmark. When it is specified, a JSON summary file containing the benchmark results is written to the given directory.
   
   ## Example JSON output
   
   ```json
   {
     "benchmark_version": "5.0.0",
     "datafusion_version": "6.0.0",
     "num_cpus": 48,
     "start_time": 1644167292,
     "arguments": [
       "benchmark",
       "datafusion",
       "--iterations",
       "1",
       "--path",
       "/mnt/bigdata/tpch/sf100-tbl",
       "--format",
       "tbl",
       "--query",
       "1",
       "--batch-size",
       "4096",
       "-o",
       "/tmp"
     ],
     "query": 1,
     "iterations": [
       {
         "elapsed": 210781.71820099998,
         "row_count": 4
       }
     ]
   }
   ```
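
For readers who want to see how a summary of this shape could be produced, here is a minimal sketch that uses only the Rust standard library. The struct name `BenchmarkRun` appears in the PR diff, but its fields here are inferred from the sample JSON above and their types are assumptions. `QueryResult`, `json_string`, and `to_json` are hypothetical names, and the actual PR may well use a serialization crate rather than hand-written formatting.

```rust
// Hypothetical mirror of the sample JSON summary. Field names come from the
// sample output; the Rust types are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct QueryResult {
    pub elapsed: f64, // milliseconds, as in the sample
    pub row_count: usize,
}

#[derive(Debug, Clone, PartialEq)]
pub struct BenchmarkRun {
    pub benchmark_version: String,
    pub datafusion_version: String,
    pub num_cpus: usize,
    pub start_time: u64, // seconds since the Unix epoch
    pub arguments: Vec<String>,
    pub query: usize,
    pub iterations: Vec<QueryResult>,
}

// Quote a string as a JSON string literal, escaping quotes, backslashes,
// and control characters.
fn json_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

impl BenchmarkRun {
    // Compact (not pretty-printed) JSON. Assumes every `elapsed` value is
    // finite, because NaN and infinity have no JSON representation.
    pub fn to_json(&self) -> String {
        let arguments: Vec<String> =
            self.arguments.iter().map(|a| json_string(a)).collect();
        let iterations: Vec<String> = self
            .iterations
            .iter()
            .map(|r| format!("{{\"elapsed\":{},\"row_count\":{}}}", r.elapsed, r.row_count))
            .collect();
        format!(
            "{{\"benchmark_version\":{},\"datafusion_version\":{},\"num_cpus\":{},\
             \"start_time\":{},\"arguments\":[{}],\"query\":{},\"iterations\":[{}]}}",
            json_string(&self.benchmark_version),
            json_string(&self.datafusion_version),
            self.num_cpus,
            self.start_time,
            arguments.join(","),
            self.query,
            iterations.join(","),
        )
    }
}
```

Keeping one entry per iteration, rather than only an average, lets downstream reporting tools compute whatever statistic they need.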
   
   # Are there any user-facing changes?
   There is a new `--output` option when running the tpch benchmarks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] alamb commented on pull request #1766: TPC-H benchmark can optionally write JSON output file with benchmark summary

Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #1766:
URL: https://github.com/apache/arrow-datafusion/pull/1766#issuecomment-1034091568


   Follow on PR that I needed to test this more easily: https://github.com/apache/arrow-datafusion/pull/1800





[GitHub] [arrow-datafusion] Dandandan commented on a change in pull request #1766: TPC-H benchmark can optionally write JSON output file with benchmark summary

Posted by GitBox <gi...@apache.org>.
Dandandan commented on a change in pull request #1766:
URL: https://github.com/apache/arrow-datafusion/pull/1766#discussion_r800393568



##########
File path: benchmarks/src/bin/tpch.rs
##########
@@ -359,6 +386,27 @@ async fn benchmark_ballista(opt: BallistaBenchmarkOpt) -> Result<()> {
     let avg = millis.iter().sum::<f64>() / millis.len() as f64;
     println!("Query {} avg time: {:.2} ms", opt.query, avg);
 
+    if let Some(path) = &opt.output_path {
+        write_summary_json(&mut benchmark_run, path)?;
+    }
+
+    Ok(())
+}
+
+fn write_summary_json(benchmark_run: &mut BenchmarkRun, path: &PathBuf) -> Result<()> {

Review comment:
   ```suggestion
   fn write_summary_json(benchmark_run: &mut BenchmarkRun, path: &Path) -> Result<()> {
   ```
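
A short sketch of why `&Path` is usually the more flexible parameter type: a `&PathBuf` coerces to `&Path` automatically, so existing callers keep working, while callers that only have a borrowed path no longer need to allocate a `PathBuf`. The helper `summary_file_path` and its file-name pattern are hypothetical, modelled on the `/tmp/tpch-q1-1644431672.json` log line quoted elsewhere in this thread, and are not taken from the PR itself.

```rust
use std::path::{Path, PathBuf};

// Hypothetical helper: takes `&Path`, so it accepts both `&PathBuf`
// (through deref coercion) and paths borrowed from string slices.
pub fn summary_file_path(dir: &Path, query: usize, start_time: u64) -> PathBuf {
    dir.join(format!("tpch-q{}-{}.json", query, start_time))
}

// Calls the helper once with an owned path and once with a borrowed one.
pub fn demo_paths() -> (PathBuf, PathBuf) {
    let owned: PathBuf = PathBuf::from("/tmp");
    // `&owned` is a `&PathBuf`; the compiler coerces it to `&Path`.
    let from_owned = summary_file_path(&owned, 1, 1644431672);
    // No `PathBuf` allocation is needed for the directory argument here.
    let from_borrowed = summary_file_path(Path::new("/tmp"), 1, 1644431672);
    (from_owned, from_borrowed)
}
```

With a `&PathBuf` parameter, the second call would have to build a `PathBuf` first, which is the pattern Clippy's `ptr_arg` lint flags.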







[GitHub] [arrow-datafusion] alamb commented on pull request #1766: TPC-H benchmark can optionally write JSON output file with benchmark summary

Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #1766:
URL: https://github.com/apache/arrow-datafusion/pull/1766#issuecomment-1034077209


   ```shell
   Query 1 iteration 2 took 62666.3 ms and returned 4 rows
   Query 1 avg time: 62579.50 ms
   Writing summary file to /tmp/tpch-q1-1644431672.json
   ```
   
   It is pretty neat:
   ```
   alamb@MacBook-Pro-2 arrow-datafusion % cat /tmp/tpch-q1-1644431672.json 
   {
     "benchmark_version": "5.0.0",
     "datafusion_version": "6.0.0",
     "num_cpus": 16,
     "start_time": 1644431672,
     "arguments": [
       "benchmark",
       "datafusion",
       "-o",
       "/tmp",
       "-p",
       "/Users/alamb/Software/tpch_data/SF1",
       "-q",
       "1",
       "--format",
       "tbl"
     ],
     "query": 1,
     "iterations": [
       {
         "elapsed": 62607.731700000004,
         "row_count": 4
       },
       {
         "elapsed": 62464.438148,
         "row_count": 4
       },
       {
         "elapsed": 62666.318697,
         "row_count": 4
       }
     ]
   }
   ```
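
As a cross-check, the reported average can be recomputed from the three `elapsed` values in the summary. The sketch below follows the averaging expression and `println!` format string visible in the diff for `benchmarks/src/bin/tpch.rs`; the function names `average_millis` and `format_average` and the empty-input handling are assumptions added for illustration.

```rust
// Mean of the per-iteration elapsed times in milliseconds.
// Returns None for an empty slice so the caller never divides by zero.
pub fn average_millis(millis: &[f64]) -> Option<f64> {
    if millis.is_empty() {
        return None;
    }
    Some(millis.iter().sum::<f64>() / millis.len() as f64)
}

// Renders the average in the same shape as the benchmark's log line.
pub fn format_average(query: usize, millis: &[f64]) -> Option<String> {
    average_millis(millis).map(|avg| format!("Query {} avg time: {:.2} ms", query, avg))
}
```

Passing `[62607.731700000004, 62464.438148, 62666.318697]` for query 1 yields `Query 1 avg time: 62579.50 ms`, which matches the log output above.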





[GitHub] [arrow-datafusion] alamb merged pull request #1766: TPC-H benchmark can optionally write JSON output file with benchmark summary

Posted by GitBox <gi...@apache.org>.
alamb merged pull request #1766:
URL: https://github.com/apache/arrow-datafusion/pull/1766


   





[GitHub] [arrow-datafusion] Dandandan commented on pull request #1766: TPC-H benchmark can optionally write JSON output file with benchmark summary

Posted by GitBox <gi...@apache.org>.
Dandandan commented on pull request #1766:
URL: https://github.com/apache/arrow-datafusion/pull/1766#issuecomment-1031373379


   @andygrove FYI I pushed a fix for a clippy linting error

