You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/08/01 13:20:11 UTC

[GitHub] [arrow-datafusion] Ted-Jiang opened a new issue, #2999: Wrong metric when using EXPLAIN ANALYZE

Ted-Jiang opened a new issue, #2999:
URL: https://github.com/apache/arrow-datafusion/issues/2999

   **Describe the bug**
   submit a query 
   ```
   
   ❯ select sum("30"), "29" from test1 group by "29";
   
   198683 rows in set. Query took 0.436 seconds.
   ```
   
   then use `explain analyze `
   ```
   explain analyze  select sum("30"), "29" from test1 group by "29";
   +-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   | plan_type         | plan                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
   +-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   | Plan with Metrics | CoalescePartitionsExec, metrics=[output_rows=198683, elapsed_compute=47.871µs, spill_count=0, spilled_bytes=0, mem_used=0]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
   |                   |   ProjectionExec: expr=[SUM(test1.30)@1 as SUM(test1.30), 29@0 as 29], metrics=[output_rows=198683, elapsed_compute=103.299µs, spill_count=0, spilled_bytes=0, mem_used=0]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
   |                   |     AggregateExec: mode=FinalPartitioned, gby=[29@0 as 29], aggr=[SUM(test1.30)], metrics=[output_rows=198683, elapsed_compute=514.502772ms, spill_count=0, spilled_bytes=0, mem_used=0]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
   |                   |       CoalesceBatchesExec: target_batch_size=4096, metrics=[output_rows=972749, elapsed_compute=91.069241ms, spill_count=0, spilled_bytes=0, mem_used=0]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
   |                   |         RepartitionExec: partitioning=Hash([Column { name: "29", index: 0 }], 100), metrics=[fetch_time{inputPartition=0}=5.544007548s, repart_time{inputPartition=0}=75.322775ms, send_time{inputPartition=0}=14.473058ms]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
   |                   |           AggregateExec: mode=Partial, gby=[29@1 as 29], aggr=[SUM(test1.30)], metrics=[output_rows=972749, elapsed_compute=1.915277419s, spill_count=0, spilled_bytes=0, mem_used=0]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |
   |                   |             RepartitionExec: partitioning=RoundRobinBatch(100), metrics=[send_time{inputPartition=0}=243.407µs, fetch_time{inputPartition=0}=43.401471ms, repart_time{inputPartition=0}=1ns]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
   |                   |               ParquetExec: limit=None, projection=[30, 29], metrics=[output_rows=1000202, elapsed_compute=13.253088ms, spill_count=0, spilled_bytes=0, mem_used=0, num_predicate_creation_errors=0, row_groups_pruned{filename=kylin/rno_native_dev/kylin_metadata_test/ssb/parquet/kylin_udaf_cube/19900101000000_20000101000000_ACG/17179869183/part-00005-84c1f179-eda5-4947-86ca-3d2497b485c4-c000.snappy.parquet}=0, bytes_scanned{filename=kylin/rno_native_dev/kylin_metadata_test/ssb/parquet/kylin_udaf_cube/19900101000000_20000101000000_ACG/17179869183/part-00005-84c1f179-eda5-4947-86ca-3d2497b485c4-c000.snappy.parquet}=5959033, predicate_evaluation_errors{filename=kylin/rno_native_dev/kylin_metadata_test/ssb/parquet/kylin_udaf_cube/19900101000000_20000101000000_ACG/17179869183/part-00005-84c1f179-eda5-4947-86ca-3d2497b485c4-c000.snappy.parquet}=0] |
   |                   |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
   +-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   1 row in set. Query took 0.093 seconds.
   ```
   the first `elapsed_compute=1.915277419s` in AGG is even longer than the query cose 😂
   
   **To Reproduce**
   Steps to reproduce the behavior:
   
   **Expected behavior**
   A clear and concise description of what you expected to happen.
   
   **Additional context**
   Add any other context about the problem here.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on issue #2999: Wrong metric when using EXPLAIN ANALYZE

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #2999:
URL: https://github.com/apache/arrow-datafusion/issues/2999#issuecomment-1203956102

   I think `elapsed_compute` is meant to capture how much cpu time was taken -- so if the query ran on 16 cores for 1 second of wall clock time, then I would expect that elapsed_compute would be reported as `16 sec`
   
   That being said, it is hard to imagine how 0.093 seconds query time could use 1.9 sec of compute 🤔 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org