Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/11/09 22:20:13 UTC

[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #4160: Minor: Extract parquet row group pruning code into its own module

alamb commented on code in PR #4160:
URL: https://github.com/apache/arrow-datafusion/pull/4160#discussion_r1018467749


##########
datafusion/core/src/physical_plan/file_format/parquet.rs:
##########
@@ -435,7 +423,7 @@ impl FileOpener for ParquetOpener {
             // Row group pruning: attempt to skip entire row_groups
             // using metadata on the row groups
             let file_metadata = builder.metadata();
-            let row_groups = prune_row_groups(
+            let row_groups = row_groups::prune_row_groups(

Review Comment:
   The whole point of this PR is to move this function (and its tests) into its own module.
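
   For readers unfamiliar with the pattern, here is a minimal sketch of what "moving a function and its tests into its own module" looks like in Rust. Everything in it is hypothetical and simplified for illustration: `RowGroupStats`, the `target` parameter, and `select_row_groups` are assumptions, not the actual DataFusion types or the real `prune_row_groups` signature. Only the shape of the refactor (a `row_groups` module, and a caller that uses the `row_groups::` path) mirrors the diff above.

```rust
// Hypothetical sketch: types and signatures are illustrative only,
// not DataFusion's real API.
mod row_groups {
    /// Simplified stand-in for the min/max statistics of one row group.
    pub struct RowGroupStats {
        pub min: i64,
        pub max: i64,
    }

    /// Returns the indexes of the row groups whose [min, max] range could
    /// contain `target`; every other row group can be skipped entirely.
    pub fn prune_row_groups(groups: &[RowGroupStats], target: i64) -> Vec<usize> {
        groups
            .iter()
            .enumerate()
            .filter(|(_, g)| g.min <= target && target <= g.max)
            .map(|(i, _)| i)
            .collect()
    }

    // The unit tests live next to the function, so they move with it.
    #[cfg(test)]
    mod tests {
        use super::*;

        #[test]
        fn skips_groups_outside_range() {
            let groups = vec![
                RowGroupStats { min: 0, max: 10 },
                RowGroupStats { min: 11, max: 20 },
            ];
            assert_eq!(prune_row_groups(&groups, 15), vec![1]);
        }
    }
}

/// Hypothetical caller, standing in for the file opener: after the move it
/// reaches the function through the module path.
pub fn select_row_groups(groups: &[row_groups::RowGroupStats], target: i64) -> Vec<usize> {
    row_groups::prune_row_groups(groups, target)
}
```

   The caller changes by one path segment only, which is why the diff at the call site is a single line.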



##########
datafusion/core/src/physical_plan/file_format/parquet.rs:
##########
@@ -17,43 +17,36 @@
 
 //! Execution plan for reading Parquet files
 
+use arrow::datatypes::SchemaRef;
 use fmt::Debug;

Review Comment:
   This file is still pretty massive, but after this PR it is down to under 2000 lines, so that is progress, I think.


