Posted to github@arrow.apache.org by "alamb (via GitHub)" <gi...@apache.org> on 2023/05/19 09:32:43 UTC

[GitHub] [arrow-datafusion] alamb commented on issue #6383: Display all partitions and their files in EXPLAIN VERBOSE

alamb commented on issue #6383:
URL: https://github.com/apache/arrow-datafusion/issues/6383#issuecomment-1554300787

   cc @yahoNanJing and @crepererum 
   
   I think this is fairly well explained and would also be a good first issue for someone.
   
   To reproduce locally, you could use something like:
   
   ```shell
   $ mkdir /tmp/foo
   $ for i in `seq 1 10`;  do  echo "1" > "/tmp/foo/data$i.csv"; done
   $ ls /tmp/foo/
   data1.csv   data10.csv  data2.csv   data3.csv   data4.csv   data5.csv   data6.csv   data7.csv   data8.csv   data9.csv
   
   $ datafusion-cli
   DataFusion CLI v24.0.0
   ❯ explain select * from '/tmp/foo';
   +---------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   | plan_type     | plan                                                                                                                                                                                                                                                                                                                                                                   |
   +---------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   | logical_plan  | TableScan: /tmp/foo projection=[1]                                                                                                                                                                                                                                                                                                                                     |
   | physical_plan | CsvExec: file_groups={10 groups: [[private/tmp/foo/data3.csv], [private/tmp/foo/data2.csv], [private/tmp/foo/data1.csv], [private/tmp/foo/data5.csv], [private/tmp/foo/data4.csv], [private/tmp/foo/data6.csv], [private/tmp/foo/data7.csv], [private/tmp/foo/data9.csv], [private/tmp/foo/data8.csv], [private/tmp/foo/data10.csv]]}, projection=[1], has_header=true |
   |               |                                                                                                                                                                                                                                                                                                                                                                        |
   +---------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   2 rows in set. Query took 0.033 seconds.
   ```
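
   If it helps to vary the number of files, the setup step above could be wrapped in a small helper. This is only a sketch: `make_csv_dir` and its arguments are hypothetical names, not part of DataFusion or of the issue, and it uses the same one-line CSV content as the loop above.

   ```shell
   # Hypothetical helper (not part of DataFusion): create COUNT one-line CSV
   # files named data1.csv .. dataCOUNT.csv in DIR, like the loop above.
   make_csv_dir() {
     dir="$1"
     count="${2:-10}"                  # default to 10 files, as in the example above
     mkdir -p "$dir"                   # -p: no error if the directory already exists
     i=1
     while [ "$i" -le "$count" ]; do
       echo "1" > "$dir/data$i.csv"    # same content as the original reproduction
       i=$((i + 1))
     done
   }

   # Recreate the directory used in the transcript above.
   make_csv_dir /tmp/foo 10
   ```

   After that, the same `explain select * from '/tmp/foo';` query can be run in `datafusion-cli` as shown above; calling the helper with a larger count gives more file groups to look at.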

