Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/08/13 06:46:36 UTC

[GitHub] [arrow-rs] Ted-Jiang commented on a diff in pull request #2435: Add ParquetRecordBatchReaderBuilder (#2427)

Ted-Jiang commented on code in PR #2435:
URL: https://github.com/apache/arrow-rs/pull/2435#discussion_r945093466


##########
parquet/src/arrow/arrow_reader/mod.rs:
##########
@@ -84,10 +204,14 @@ pub trait ArrowReader {
     ) -> Result<Self::RecordReader>;
 }
 
+/// Options that control how metadata is read for a parquet file
+///
+/// See [`ArrowReaderBuilder`] for how to configure how the column data
+/// is then read from the file, including projection and filter pushdown
 #[derive(Debug, Clone, Default)]
 pub struct ArrowReaderOptions {
     skip_arrow_metadata: bool,
-    selection: Option<RowSelection>,
+    page_index: bool,

Review Comment:
   Sounds reasonable.
   I think reading the page index should be part of opening the file. Have you measured how much time reading the page index costs? 🤔