Posted to github@arrow.apache.org by "collimarco (via GitHub)" <gi...@apache.org> on 2023/06/20 16:07:07 UTC

[GitHub] [arrow-datafusion] collimarco opened a new issue, #6732: SQL on multiple parquet files doesn't work (returns ++ instead of result)

collimarco opened a new issue, #6732:
URL: https://github.com/apache/arrow-datafusion/issues/6732

   ### Describe the bug
   
   An SQL query on a directory of multiple Parquet files returns an empty result (`++`) instead of rows, and no error is reported.
   
   
   ### To Reproduce
   
   This is the code:
   
   ```
   use datafusion::datasource::file_format::file_type::{FileType, GetExt};
   use datafusion::datasource::file_format::parquet::ParquetFormat;
   use datafusion::datasource::listing::ListingOptions;
   use datafusion::error::Result;
   use datafusion::prelude::*;
   use std::sync::Arc;
   
   /// This example demonstrates executing a simple query against an Arrow data source (a directory
   /// with multiple Parquet files) and fetching results
   #[tokio::main]
   async fn main() -> Result<()> {
       let ctx = SessionContext::new();
       let testdata = "/Users/example/Desktop/data_bucket";
       let file_format = ParquetFormat::default().with_enable_pruning(Some(true));
       let listing_options = ListingOptions::new(Arc::new(file_format))
           .with_file_extension(FileType::PARQUET.get_ext());
   
       ctx.register_listing_table(
           "my_table",
           &format!("file://{testdata}"),
           listing_options,
           None,
           None,
       )
       .await
       .unwrap();
   
       let df = ctx
           .sql("SELECT * FROM my_table LIMIT 10")
           .await?;
   
       df.show().await?;
   
       Ok(())
   }
   ```
   
   And you need a directory with some Parquet files ("/Users/example/Desktop/data_bucket").
   
   ### Expected behavior
   
   You get a result and some rows are printed
   
   ### Additional context
   
   You only get:
   
   ```
   ++
   ++
   ```
   
   However, both the build and the execution succeed: no error is displayed.
   
   I am using macOS for testing.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] jiangzhx commented on issue #6732: SQL on multiple parquet files doesn't work (returns ++ instead of result)

Posted by "jiangzhx (via GitHub)" <gi...@apache.org>.
jiangzhx commented on issue #6732:
URL: https://github.com/apache/arrow-datafusion/issues/6732#issuecomment-1606510999

   > There is another case (3) where the command fails silently.
   > 
   > You need to call `ctx.register_listing_table` with this:
   > 
   > ```
   > "/Users/example/Desktop/my_bucket/parquet/"
   > ```
   > 
   > And not this:
   > 
   > ```
   > "/Users/example/Desktop/my_bucket/parquet"
   > ```
   > 
   > (_note the forward slash at the end of the path to indicate a directory_)
   
   Ending with a forward slash or not, it's all working fine for me.


[GitHub] [arrow-datafusion] collimarco commented on issue #6732: SQL on multiple parquet files doesn't work (returns ++ instead of result)

Posted by "collimarco (via GitHub)" <gi...@apache.org>.
collimarco commented on issue #6732:
URL: https://github.com/apache/arrow-datafusion/issues/6732#issuecomment-1606924516

   > Ending with a forward slash or not, it's all working fine for me.
   
   @jiangzhx I am testing on macOS and the bug seems to be present there; maybe you are using another OS.


[GitHub] [arrow-datafusion] collimarco commented on issue #6732: SQL on multiple parquet files doesn't work (returns ++ instead of result)

Posted by "collimarco (via GitHub)" <gi...@apache.org>.
collimarco commented on issue #6732:
URL: https://github.com/apache/arrow-datafusion/issues/6732#issuecomment-1604598185

   @alamb Here's the result:
   
   ```
   +------------------------------------------------------------+-------------------------------------------------+
   | plan_type                                                  | plan                                            |
   +------------------------------------------------------------+-------------------------------------------------+
   | initial_logical_plan                                       | Limit: skip=0, fetch=10                         |
   |                                                            |   Projection:                                   |
   |                                                            |     TableScan: my_table                         |
   | logical_plan after inline_table_scan                       | SAME TEXT AS ABOVE                              |
   | logical_plan after type_coercion                           | SAME TEXT AS ABOVE                              |
   | logical_plan after count_wildcard_rule                     | SAME TEXT AS ABOVE                              |
   | analyzed_logical_plan                                      | SAME TEXT AS ABOVE                              |
   | logical_plan after simplify_expressions                    | SAME TEXT AS ABOVE                              |
   | logical_plan after unwrap_cast_in_comparison               | SAME TEXT AS ABOVE                              |
   | logical_plan after replace_distinct_aggregate              | SAME TEXT AS ABOVE                              |
   | logical_plan after eliminate_join                          | SAME TEXT AS ABOVE                              |
   | logical_plan after decorrelate_predicate_subquery          | SAME TEXT AS ABOVE                              |
   | logical_plan after scalar_subquery_to_join                 | SAME TEXT AS ABOVE                              |
   | logical_plan after extract_equijoin_predicate              | SAME TEXT AS ABOVE                              |
   | logical_plan after simplify_expressions                    | SAME TEXT AS ABOVE                              |
   | logical_plan after merge_projection                        | SAME TEXT AS ABOVE                              |
   | logical_plan after rewrite_disjunctive_predicate           | SAME TEXT AS ABOVE                              |
   | logical_plan after eliminate_duplicated_expr               | SAME TEXT AS ABOVE                              |
   | logical_plan after eliminate_filter                        | SAME TEXT AS ABOVE                              |
   | logical_plan after eliminate_cross_join                    | SAME TEXT AS ABOVE                              |
   | logical_plan after common_sub_expression_eliminate         | SAME TEXT AS ABOVE                              |
   | logical_plan after eliminate_limit                         | SAME TEXT AS ABOVE                              |
   | logical_plan after propagate_empty_relation                | SAME TEXT AS ABOVE                              |
   | logical_plan after filter_null_join_keys                   | SAME TEXT AS ABOVE                              |
   | logical_plan after eliminate_outer_join                    | SAME TEXT AS ABOVE                              |
   | logical_plan after push_down_limit                         | Projection:                                     |
   |                                                            |   Limit: skip=0, fetch=10                       |
   |                                                            |     TableScan: my_table, fetch=10               |
   | logical_plan after push_down_filter                        | SAME TEXT AS ABOVE                              |
   | logical_plan after single_distinct_aggregation_to_group_by | SAME TEXT AS ABOVE                              |
   | logical_plan after simplify_expressions                    | SAME TEXT AS ABOVE                              |
   | logical_plan after unwrap_cast_in_comparison               | SAME TEXT AS ABOVE                              |
   | logical_plan after common_sub_expression_eliminate         | SAME TEXT AS ABOVE                              |
   | logical_plan after push_down_projection                    | Limit: skip=0, fetch=10                         |
   |                                                            |   Projection:                                   |
   |                                                            |     TableScan: my_table projection=[], fetch=10 |
   | logical_plan after eliminate_projection                    | Limit: skip=0, fetch=10                         |
   |                                                            |   TableScan: my_table projection=[], fetch=10   |
   | logical_plan after push_down_limit                         | SAME TEXT AS ABOVE                              |
   | logical_plan after simplify_expressions                    | SAME TEXT AS ABOVE                              |
   | logical_plan after unwrap_cast_in_comparison               | SAME TEXT AS ABOVE                              |
   | logical_plan after replace_distinct_aggregate              | SAME TEXT AS ABOVE                              |
   | logical_plan after eliminate_join                          | SAME TEXT AS ABOVE                              |
   | logical_plan after decorrelate_predicate_subquery          | SAME TEXT AS ABOVE                              |
   | logical_plan after scalar_subquery_to_join                 | SAME TEXT AS ABOVE                              |
   | logical_plan after extract_equijoin_predicate              | SAME TEXT AS ABOVE                              |
   | logical_plan after simplify_expressions                    | SAME TEXT AS ABOVE                              |
   | logical_plan after merge_projection                        | SAME TEXT AS ABOVE                              |
   | logical_plan after rewrite_disjunctive_predicate           | SAME TEXT AS ABOVE                              |
   | logical_plan after eliminate_duplicated_expr               | SAME TEXT AS ABOVE                              |
   | logical_plan after eliminate_filter                        | SAME TEXT AS ABOVE                              |
   | logical_plan after eliminate_cross_join                    | SAME TEXT AS ABOVE                              |
   | logical_plan after common_sub_expression_eliminate         | SAME TEXT AS ABOVE                              |
   | logical_plan after eliminate_limit                         | SAME TEXT AS ABOVE                              |
   | logical_plan after propagate_empty_relation                | SAME TEXT AS ABOVE                              |
   | logical_plan after filter_null_join_keys                   | SAME TEXT AS ABOVE                              |
   | logical_plan after eliminate_outer_join                    | SAME TEXT AS ABOVE                              |
   | logical_plan after push_down_limit                         | SAME TEXT AS ABOVE                              |
   | logical_plan after push_down_filter                        | SAME TEXT AS ABOVE                              |
   | logical_plan after single_distinct_aggregation_to_group_by | SAME TEXT AS ABOVE                              |
   | logical_plan after simplify_expressions                    | SAME TEXT AS ABOVE                              |
   | logical_plan after unwrap_cast_in_comparison               | SAME TEXT AS ABOVE                              |
   | logical_plan after common_sub_expression_eliminate         | SAME TEXT AS ABOVE                              |
   | logical_plan after push_down_projection                    | SAME TEXT AS ABOVE                              |
   | logical_plan after eliminate_projection                    | SAME TEXT AS ABOVE                              |
   | logical_plan after push_down_limit                         | SAME TEXT AS ABOVE                              |
   | logical_plan                                               | Limit: skip=0, fetch=10                         |
   |                                                            |   TableScan: my_table projection=[], fetch=10   |
   | initial_physical_plan                                      | GlobalLimitExec: skip=0, fetch=10               |
   |                                                            |   EmptyExec: produce_one_row=false              |
   |                                                            |                                                 |
   | physical_plan after aggregate_statistics                   | SAME TEXT AS ABOVE                              |
   | physical_plan after join_selection                         | SAME TEXT AS ABOVE                              |
   | physical_plan after PipelineFixer                          | SAME TEXT AS ABOVE                              |
   | physical_plan after repartition                            | SAME TEXT AS ABOVE                              |
   | physical_plan after global_sort_selection                  | SAME TEXT AS ABOVE                              |
   | physical_plan after EnforceDistribution                    | SAME TEXT AS ABOVE                              |
   | physical_plan after CombinePartialFinalAggregate           | SAME TEXT AS ABOVE                              |
   | physical_plan after EnforceSorting                         | SAME TEXT AS ABOVE                              |
   | physical_plan after coalesce_batches                       | SAME TEXT AS ABOVE                              |
   | physical_plan after PipelineChecker                        | SAME TEXT AS ABOVE                              |
   | physical_plan                                              | GlobalLimitExec: skip=0, fetch=10               |
   |                                                            |   EmptyExec: produce_one_row=false              |
   |                                                            |                                                 |
   +------------------------------------------------------------+-------------------------------------------------+
   ```
   


[GitHub] [arrow-datafusion] jiangzhx commented on issue #6732: SQL on multiple parquet files doesn't work (returns ++ instead of result)

Posted by "jiangzhx (via GitHub)" <gi...@apache.org>.
jiangzhx commented on issue #6732:
URL: https://github.com/apache/arrow-datafusion/issues/6732#issuecomment-1600428592

   My previous guess was that the file extension was incorrect, but that doesn't seem to be the case.
   Could you please print the output of `explain analyze`? It would also be great if you could find a way to upload one of the Parquet files.
   ```
       let df = ctx
           .sql("SELECT * FROM my_table ")
           .await?
           .explain(false, true)?;
   ```
   or
   ```
   explain analyze SELECT * FROM my_table
   ```


[GitHub] [arrow-datafusion] collimarco commented on issue #6732: SQL on multiple parquet files doesn't work (returns ++ instead of result)

Posted by "collimarco (via GitHub)" <gi...@apache.org>.
collimarco commented on issue #6732:
URL: https://github.com/apache/arrow-datafusion/issues/6732#issuecomment-1606196676

   That's really strange. I am also running `EXPLAIN VERBOSE SELECT * FROM my_table LIMIT 10` and the output doesn't even list the file names.
   
   However, when I test on another directory of Parquet files, it works.
   
   Maybe it's one of the following:
   1. If there are some other files (not `.parquet`) in the same directory, that causes the issue
   2. if some Parquet files in the directory have a different schema, that causes the issue
   
   In any case **it should throw some errors or display some debug messages in the above cases, and not fail silently**! Otherwise debugging any issue will become a nightmare without a clear message that detects the exact problem.
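   
   A minimal sketch of the kind of pre-flight check that would surface case 1 loudly, using only the Rust standard library. The helper names `count_files_with_suffix` and `ensure_has_data_files` are hypothetical and not part of DataFusion, and the sketch assumes files are selected by a name suffix such as `.parquet`. It does not detect case 2 (schema differences):
   
   ```rust
   use std::fs;
   use std::io;
   use std::path::Path;
   
   /// Counts the regular files directly inside `dir` whose names end with `suffix`.
   /// Subdirectories and symlinks are not followed or counted.
   fn count_files_with_suffix(dir: &Path, suffix: &str) -> io::Result<usize> {
       let mut count = 0;
       for entry in fs::read_dir(dir)? {
           let entry = entry?;
           let is_file = entry.file_type()?.is_file();
           let name = entry.file_name();
           if is_file && name.to_string_lossy().ends_with(suffix) {
               count += 1;
           }
       }
       Ok(count)
   }
   
   /// Returns the number of matching files, or an error when there are none,
   /// so that an empty listing is reported instead of yielding an empty result.
   fn ensure_has_data_files(dir: &Path, suffix: &str) -> io::Result<usize> {
       match count_files_with_suffix(dir, suffix)? {
           0 => Err(io::Error::new(
               io::ErrorKind::NotFound,
               format!("no files ending in {suffix:?} found in {}", dir.display()),
           )),
           n => Ok(n),
       }
   }
   ```
   
   Running `ensure_has_data_files` on the data directory before registering the table would report a missing or empty directory as an error rather than as `++`.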


[GitHub] [arrow-datafusion] collimarco commented on issue #6732: SQL on multiple parquet files doesn't work (returns ++ instead of result)

Posted by "collimarco (via GitHub)" <gi...@apache.org>.
collimarco commented on issue #6732:
URL: https://github.com/apache/arrow-datafusion/issues/6732#issuecomment-1600353057

   @jiangzhx Sure. If I run `ls -al ./data_bucket` I only have the following Parquet files in the directory: `0.prod.parquet`, `1.prod.parquet`, `2.prod.parquet` ... `9.prod.parquet`


[GitHub] [arrow-datafusion] collimarco commented on issue #6732: SQL on multiple parquet files doesn't work (returns ++ instead of result)

Posted by "collimarco (via GitHub)" <gi...@apache.org>.
collimarco commented on issue #6732:
URL: https://github.com/apache/arrow-datafusion/issues/6732#issuecomment-1600482147

   @jiangzhx I get this:
   
   ```
   +-------------------+---------------------------------------------------------------------------------+
   | plan_type         | plan                                                                            |
   +-------------------+---------------------------------------------------------------------------------+
   | Plan with Metrics | GlobalLimitExec: skip=0, fetch=10, metrics=[output_rows=0, elapsed_compute=1ns] |
   |                   |   EmptyExec: produce_one_row=false, metrics=[]                                  |
   |                   |                                                                                 |
   +-------------------+---------------------------------------------------------------------------------+
   ```
   


Re: [I] SQL on multiple parquet files doesn't work (returns ++ instead of result) [arrow-datafusion]

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #6732:
URL: https://github.com/apache/arrow-datafusion/issues/6732#issuecomment-1785789206

   Possibly the same as https://github.com/apache/arrow-datafusion/issues/7954


[GitHub] [arrow-datafusion] jiangzhx commented on issue #6732: SQL on multiple parquet files doesn't work (returns ++ instead of result)

Posted by "jiangzhx (via GitHub)" <gi...@apache.org>.
jiangzhx commented on issue #6732:
URL: https://github.com/apache/arrow-datafusion/issues/6732#issuecomment-1600880001

   I suspect that there might be an issue with the Parquet files, because I can query the test data normally using the code you provided.
   You can find the test data at https://github.com/apache/parquet-testing/tree/a11fc8f148f8a7a89d9281cc0da3eb9d56095fbf/data
   
   - alltypes_plain.parquet
   - alltypes_plain.snappy.parquet
   
   


[GitHub] [arrow-datafusion] collimarco commented on issue #6732: SQL on multiple parquet files doesn't work (returns ++ instead of result)

Posted by "collimarco (via GitHub)" <gi...@apache.org>.
collimarco commented on issue #6732:
URL: https://github.com/apache/arrow-datafusion/issues/6732#issuecomment-1606210417

   There is another case (3) where the command fails silently.
   
   You need to call `ctx.register_listing_table` with this:
   
   ```
   "/Users/example/Desktop/my_bucket/parquet/"
   ```
   
   And not this:
   
   ```
   "/Users/example/Desktop/my_bucket/parquet"
   ```
   
   (*note the forward slash at the end of the path to indicate a directory*)
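   
   As a workaround sketch, the path can be normalized before it is passed to `ctx.register_listing_table`. The helper name `as_directory_path` is hypothetical (it is not a DataFusion API), and it only handles plain `/`-separated paths:
   
   ```rust
   /// Returns `path` with exactly one trailing forward slash, so that it
   /// reads unambiguously as a directory. An empty input is returned as-is
   /// rather than being turned into the root directory.
   fn as_directory_path(path: &str) -> String {
       if path.is_empty() {
           return String::new();
       }
       // Drop any trailing slashes first, then append a single one.
       let trimmed = path.trim_end_matches('/');
       format!("{trimmed}/")
   }
   ```
   
   With this, both spellings of the path above normalize to the form ending in `/`.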


[GitHub] [arrow-datafusion] jiangzhx commented on issue #6732: SQL on multiple parquet files doesn't work (returns ++ instead of result)

Posted by "jiangzhx (via GitHub)" <gi...@apache.org>.
jiangzhx commented on issue #6732:
URL: https://github.com/apache/arrow-datafusion/issues/6732#issuecomment-1606510558

   > 2. if some Parquet files in the directory have a different schema, that causes the issue
   
   Nice dig. I think this is the key reason for this situation. I have faced a similar scenario before.
   
   


[GitHub] [arrow-datafusion] jiangzhx commented on issue #6732: SQL on multiple parquet files doesn't work (returns ++ instead of result)

Posted by "jiangzhx (via GitHub)" <gi...@apache.org>.
jiangzhx commented on issue #6732:
URL: https://github.com/apache/arrow-datafusion/issues/6732#issuecomment-1600028006

   Could you please list the files in the data_bucket?


[GitHub] [arrow-datafusion] collimarco commented on issue #6732: SQL on multiple parquet files doesn't work (returns ++ instead of result)

Posted by "collimarco (via GitHub)" <gi...@apache.org>.
collimarco commented on issue #6732:
URL: https://github.com/apache/arrow-datafusion/issues/6732#issuecomment-1600943343

   @jiangzhx 
   
   The files are valid. 
   
   I can easily open them with an editor like https://github.com/antonycourtney/tad and they are displayed properly. I have generated them with [Apache Arrow Ruby](https://github.com/apache/arrow/tree/main/ruby):
   
   ```ruby
   arrow_table = Arrow::Table.new(table)
   arrow_table.save("example.parquet", compression: :zstd)
   ```
   


[GitHub] [arrow-datafusion] jiangzhx commented on issue #6732: SQL on multiple parquet files doesn't work (returns ++ instead of result)

Posted by "jiangzhx (via GitHub)" <gi...@apache.org>.
jiangzhx commented on issue #6732:
URL: https://github.com/apache/arrow-datafusion/issues/6732#issuecomment-1606514619

   @collimarco Maybe we could add some documentation here to make it clear when to use DataFusion for querying Parquet files.
   https://github.com/apache/arrow-datafusion/blame/8c7678a0b10d14029b5c34f822dc2605e00390fd/docs/source/user-guide/example-usage.md#L35


[GitHub] [arrow-datafusion] alamb commented on issue #6732: SQL on multiple parquet files doesn't work (returns ++ instead of result)

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #6732:
URL: https://github.com/apache/arrow-datafusion/issues/6732#issuecomment-1604519046

   ```
   +-------------------+---------------------------------------------------------------------------------+
   | plan_type         | plan                                                                            |
   +-------------------+---------------------------------------------------------------------------------+
   | Plan with Metrics | GlobalLimitExec: skip=0, fetch=10, metrics=[output_rows=0, elapsed_compute=1ns] |
   |                   |   EmptyExec: produce_one_row=false, metrics=[]                                  |
   |                   |                                                                                 |
   +-------------------+---------------------------------------------------------------------------------+
   ```
   
   This plan indicates that DataFusion thinks there are no rows in the file and has replaced the actual scan with an `EmptyExec`.
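   
   For an automated check, a captured plan can be scanned for this operator. A minimal sketch, assuming the `EXPLAIN` output has already been collected into a string; `plan_contains_empty_exec` is a hypothetical helper, not a DataFusion API:
   
   ```rust
   /// Returns true if any line of the given plan text names an `EmptyExec`
   /// operator, which is the symptom shown in the plan above.
   fn plan_contains_empty_exec(plan_text: &str) -> bool {
       plan_text.lines().any(|line| line.contains("EmptyExec"))
   }
   ```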
   
   Would it be possible to run an `explain verbose SELECT * FROM my_table` or `explain analyze verbose SELECT * FROM my_table` and post the output?
   
   This would tell us whether the issue is related to how the files are being found, or to something else, for example the statistics that are being written.
   


[GitHub] [arrow-datafusion] jiangzhx commented on issue #6732: SQL on multiple parquet files doesn't work (returns ++ instead of result)

Posted by "jiangzhx (via GitHub)" <gi...@apache.org>.
jiangzhx commented on issue #6732:
URL: https://github.com/apache/arrow-datafusion/issues/6732#issuecomment-1606098012

   @collimarco 
   I tested on the "alltypes_plain.parquet" file and got the following output from "explain verbose".
   You can see:
   - the projection with all column names
   - the "file_groups" entry on the last line, specifying the location of the file
   
   
   
   ```
   +------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   | plan_type                                                  | plan                                                                                                                                                                                                                                                                                                                                                                                                           |
   +------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   | initial_logical_plan                                       | Projection: my_table.id, my_table.bool_col, my_table.tinyint_col, my_table.smallint_col, my_table.int_col, my_table.bigint_col, my_table.float_col, my_table.double_col, my_table.date_string_col, my_table.string_col, my_table.timestamp_col                                                                                                                                                                 |
   |                                                            |   TableScan: my_table                                                                                                                                                                                                                                                                                                                                                                                          |
   | logical_plan after inline_table_scan                       | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after type_coercion                           | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after count_wildcard_rule                     | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | analyzed_logical_plan                                      | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after simplify_expressions                    | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after unwrap_cast_in_comparison               | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after replace_distinct_aggregate              | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after eliminate_join                          | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after decorrelate_predicate_subquery          | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after scalar_subquery_to_join                 | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after extract_equijoin_predicate              | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after simplify_expressions                    | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after merge_projection                        | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after rewrite_disjunctive_predicate           | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after eliminate_duplicated_expr               | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after eliminate_filter                        | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after eliminate_cross_join                    | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after common_sub_expression_eliminate         | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after eliminate_limit                         | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after propagate_empty_relation                | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after filter_null_join_keys                   | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after eliminate_outer_join                    | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after push_down_limit                         | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after push_down_filter                        | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after single_distinct_aggregation_to_group_by | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after simplify_expressions                    | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after unwrap_cast_in_comparison               | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after common_sub_expression_eliminate         | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after push_down_projection                    | Projection: my_table.id, my_table.bool_col, my_table.tinyint_col, my_table.smallint_col, my_table.int_col, my_table.bigint_col, my_table.float_col, my_table.double_col, my_table.date_string_col, my_table.string_col, my_table.timestamp_col                                                                                                                                                                 |
   |                                                            |   TableScan: my_table projection=[id, bool_col, tinyint_col, smallint_col, int_col, bigint_col, float_col, double_col, date_string_col, string_col, timestamp_col]                                                                                                                                                                                                                                             |
   | logical_plan after eliminate_projection                    | TableScan: my_table projection=[id, bool_col, tinyint_col, smallint_col, int_col, bigint_col, float_col, double_col, date_string_col, string_col, timestamp_col]                                                                                                                                                                                                                                               |
   | logical_plan after push_down_limit                         | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after simplify_expressions                    | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after unwrap_cast_in_comparison               | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after replace_distinct_aggregate              | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after eliminate_join                          | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after decorrelate_predicate_subquery          | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after scalar_subquery_to_join                 | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after extract_equijoin_predicate              | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after simplify_expressions                    | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after merge_projection                        | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after rewrite_disjunctive_predicate           | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after eliminate_duplicated_expr               | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after eliminate_filter                        | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after eliminate_cross_join                    | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after common_sub_expression_eliminate         | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after eliminate_limit                         | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after propagate_empty_relation                | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after filter_null_join_keys                   | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after eliminate_outer_join                    | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after push_down_limit                         | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after push_down_filter                        | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after single_distinct_aggregation_to_group_by | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after simplify_expressions                    | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after unwrap_cast_in_comparison               | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after common_sub_expression_eliminate         | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after push_down_projection                    | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after eliminate_projection                    | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan after push_down_limit                         | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | logical_plan                                               | TableScan: my_table projection=[id, bool_col, tinyint_col, smallint_col, int_col, bigint_col, float_col, double_col, date_string_col, string_col, timestamp_col]                                                                                                                                                                                                                                               |
   | initial_physical_plan                                      | ParquetExec: file_groups={1 group: [[Users/sylar/workspace/opensource/arrow-datafusion/parquet-testing/data/multiple_snappy_files/1.prod.parquet, Users/sylar/workspace/opensource/arrow-datafusion/parquet-testing/data/multiple_snappy_files/2.prod.parquet]]}, projection=[id, bool_col, tinyint_col, smallint_col, int_col, bigint_col, float_col, double_col, date_string_col, string_col, timestamp_col] |
   |                                                            |                                                                                                                                                                                                                                                                                                                                                                                                                |
   | physical_plan after aggregate_statistics                   | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | physical_plan after join_selection                         | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | physical_plan after PipelineFixer                          | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | physical_plan after repartition                            | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | physical_plan after global_sort_selection                  | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | physical_plan after EnforceDistribution                    | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | physical_plan after CombinePartialFinalAggregate           | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | physical_plan after EnforceSorting                         | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | physical_plan after coalesce_batches                       | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | physical_plan after PipelineChecker                        | SAME TEXT AS ABOVE                                                                                                                                                                                                                                                                                                                                                                                             |
   | physical_plan                                              | ParquetExec: file_groups={1 group: [[Users/sylar/workspace/opensource/arrow-datafusion/parquet-testing/data/multiple_snappy_files/1.prod.parquet, Users/sylar/workspace/opensource/arrow-datafusion/parquet-testing/data/multiple_snappy_files/2.prod.parquet]]}, projection=[id, bool_col, tinyint_col, smallint_col, int_col, bigint_col, float_col, double_col, date_string_col, string_col, timestamp_col] |
   |                                                            |                                                                                                                                                                                                                                                                                                                                                                                                                |
   +------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   
   ```

