Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/07/26 13:41:23 UTC

[GitHub] [arrow-datafusion] alamb opened a new issue #779: Implement EXPLAIN ANALYZE

alamb opened a new issue #779:
URL: https://github.com/apache/arrow-datafusion/issues/779


   **Is your feature request related to a problem or challenge? Please describe what you are trying to do.**
   Now that we have `EXPLAIN <query>`, we know what plan DataFusion *will* execute. However, there is no particularly easy way to see what actually happened (e.g. how many rows were actually read / filtered by each operator).
   
   **Describe the solution you'd like**
   I would like to extend DataFusion's EXPLAIN functionality so that it can also run the plan, capture metrics, and display them.
   
   I imagine something like the following (adding the `executed_plan` row):
   ```
   > EXPLAIN ANALYZE SELECT * from foo;
   +---------------+--------------------------------------------------------------------------+
   | plan_type     | plan                                                                     |
   +---------------+--------------------------------------------------------------------------+
   | logical_plan  | Projection: #foo.x                                                       |
   |               |   TableScan: foo projection=Some([0])                                    |
   | physical_plan | ProjectionExec: expr=[x@0 as x]                                          |
   |               |   RepartitionExec: partitioning=RoundRobinBatch(16)                      |
   |               |     CsvExec: source=Path(/tmp/foo.csv: [/tmp/foo.csv]), has_header=false |
   | executed_plan | ProjectionExec: num_rows=2, exec_ms=6                                    |
   |               |   RepartitionExec: num_rows=2, exec_ms=4                                 |
   |               |     CsvExec: num_rows=2, exec_ms=300                                     |
   +---------------+--------------------------------------------------------------------------+
   ```
   2 rows in set. Query took 0.002 seconds.
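
A minimal sketch of the "run the plan, capture metrics, display them" idea described above, assuming a toy single-child operator tree over in-memory `i64` rows. `Node`, `Metrics`, `execute`, `display_with_metrics`, `keep_all`, and `is_even` are hypothetical names used only for illustration; none of them are DataFusion APIs, and real operators are streaming and partitioned rather than eager.

```rust
use std::time::Instant;

/// Metrics captured for one operator while the plan runs (hypothetical type).
#[derive(Debug, Default, Clone, PartialEq)]
struct Metrics {
    num_rows: usize,
    exec_ms: u128,
}

/// Hypothetical plan node: a display name, a row predicate, and at most one child.
struct Node {
    name: String,
    /// Rows for which the predicate returns true are passed on.
    predicate: fn(&i64) -> bool,
    child: Option<Box<Node>>,
    metrics: Metrics,
}

/// Predicate that keeps every row (used by pass-through operators).
fn keep_all(_row: &i64) -> bool {
    true
}

/// Example predicate that keeps even values only.
fn is_even(row: &i64) -> bool {
    *row % 2 == 0
}

impl Node {
    fn new(name: &str, predicate: fn(&i64) -> bool, child: Option<Node>) -> Self {
        Node {
            name: name.to_string(),
            predicate,
            child: child.map(Box::new),
            metrics: Metrics::default(),
        }
    }

    /// Run this node eagerly: take rows from the child (or from `source` at a
    /// leaf), apply the predicate, and record the output row count and the
    /// elapsed time. The time is inclusive of the child's time.
    fn execute(&mut self, source: &[i64]) -> Vec<i64> {
        let start = Instant::now();
        let input = match self.child.as_mut() {
            Some(child) => child.execute(source),
            None => source.to_vec(),
        };
        let predicate = self.predicate;
        let output: Vec<i64> = input.into_iter().filter(|row| predicate(row)).collect();
        self.metrics = Metrics {
            num_rows: output.len(),
            exec_ms: start.elapsed().as_millis(),
        };
        output
    }

    /// Render the executed plan as an indented tree annotated with metrics.
    /// Timing can be left out to get deterministic text.
    fn display_with_metrics(&self, include_timing: bool) -> String {
        let mut out = String::new();
        self.fmt_indented(0, include_timing, &mut out);
        out
    }

    fn fmt_indented(&self, depth: usize, include_timing: bool, out: &mut String) {
        out.push_str(&"  ".repeat(depth));
        out.push_str(&format!("{}: num_rows={}", self.name, self.metrics.num_rows));
        if include_timing {
            out.push_str(&format!(", exec_ms={}", self.metrics.exec_ms));
        }
        out.push('\n');
        if let Some(child) = &self.child {
            child.fmt_indented(depth + 1, include_timing, out);
        }
    }
}
```

In this sketch, a caller would build the tree leaf-first with `Node::new`, call `execute` on the root, and then print `display_with_metrics(true)` to get text shaped like the `executed_plan` rows above. Metrics are stored on the nodes themselves, so the display step needs nothing beyond the plan that was run.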
   
   **Additional context**
   We probably need something like https://github.com/apache/arrow-datafusion/issues/679 completed prior to doing this?
   
   cc @Dandandan  and @andygrove 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] alamb commented on issue #779: Implement EXPLAIN ANALYZE

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #779:
URL: https://github.com/apache/arrow-datafusion/issues/779#issuecomment-886715077


   There is some prior work by @andygrove here: https://github.com/apache/arrow-datafusion/pull/662 (it added `with_metrics` for displayable physical plans).





[GitHub] [arrow-datafusion] alamb closed issue #779: Implement EXPLAIN ANALYZE

Posted by GitBox <gi...@apache.org>.
alamb closed issue #779:
URL: https://github.com/apache/arrow-datafusion/issues/779


   

