Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/11/03 14:12:57 UTC

[GitHub] [arrow] sunchao commented on a change in pull request #8525: ARROW-10387: [Rust][Parquet] Avoid call for file size metadata to read footer

sunchao commented on a change in pull request #8525:
URL: https://github.com/apache/arrow/pull/8525#discussion_r516152984



##########
File path: rust/parquet/src/file/footer.rs
##########
@@ -44,30 +44,31 @@ use crate::schema::types::{self, SchemaDescriptor};
 /// The reader first reads DEFAULT_FOOTER_SIZE bytes from the end of the file.
 /// If it is not enough according to the length indicated in the footer, it reads more bytes.
 pub fn parse_metadata<R: ChunkReader>(chunk_reader: &R) -> Result<ParquetMetaData> {
-    // check file is large enough to hold footer
-    let file_size = chunk_reader.len();
-    if file_size < (FOOTER_SIZE as u64) {
+    // read and cache up to DEFAULT_FOOTER_READ_SIZE bytes from the end and process the footer
+    let mut first_end_read = chunk_reader.get_read(

Review comment:
       Yes, exactly. I feel the footer reader should not need to know how the input stream is processed. The logic can also vary depending on the remote storage, so a single `DEFAULT_FOOTER_READ_SIZE` may not fit all cases.
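
       To make the point concrete, here is a minimal sketch of one way to keep the read-size policy out of the footer parser. It is not the code in this PR: the `TailSource` trait, the `InMemory` type, its `prefetch` field, and `read_metadata_bytes` are all hypothetical names invented for illustration. The only thing taken from the Parquet format itself is the 8-byte footer layout: a 4-byte little-endian metadata length followed by the `PAR1` magic. The length is read as `u32` here for brevity.

```rust
use std::io;

/// Size of the fixed Parquet footer: 4-byte metadata length + 4-byte magic.
const FOOTER_SIZE: usize = 8;
const MAGIC: &[u8; 4] = b"PAR1";

/// Hypothetical abstraction (not part of the parquet crate).
/// The source returns at least `min_len` bytes from the end of the input,
/// or the whole input if it is shorter. It may return more: how much to
/// prefetch is the source's decision, not the parser's.
trait TailSource {
    fn read_tail(&self, min_len: usize) -> io::Result<Vec<u8>>;
}

/// Hypothetical in-memory source with a configurable prefetch size,
/// standing in for a local file or a remote object store.
struct InMemory {
    data: Vec<u8>,
    prefetch: usize,
}

impl TailSource for InMemory {
    fn read_tail(&self, min_len: usize) -> io::Result<Vec<u8>> {
        // Serve the larger of the request and the prefetch policy,
        // capped at the total input length.
        let want = min_len.max(self.prefetch).min(self.data.len());
        Ok(self.data[self.data.len() - want..].to_vec())
    }
}

fn invalid(msg: &str) -> io::Error {
    io::Error::new(io::ErrorKind::InvalidData, msg.to_string())
}

/// Returns the raw (still encoded) metadata bytes that precede the footer.
/// The parser only states how many bytes it needs; it never picks a
/// default read size itself.
fn read_metadata_bytes<S: TailSource>(source: &S) -> io::Result<Vec<u8>> {
    let mut tail = source.read_tail(FOOTER_SIZE)?;
    if tail.len() < FOOTER_SIZE {
        return Err(invalid("input is smaller than the footer"));
    }
    let n = tail.len();
    if &tail[n - 4..] != MAGIC {
        return Err(invalid("missing PAR1 magic"));
    }
    let mut len_bytes = [0u8; 4];
    len_bytes.copy_from_slice(&tail[n - 8..n - 4]);
    let metadata_len = u32::from_le_bytes(len_bytes) as usize;

    // Ask for more only if the source's first answer was too short.
    let needed = metadata_len + FOOTER_SIZE;
    if tail.len() < needed {
        tail = source.read_tail(needed)?;
        if tail.len() < needed {
            return Err(invalid("metadata length exceeds input size"));
        }
    }
    let n = tail.len();
    Ok(tail[n - needed..n - FOOTER_SIZE].to_vec())
}
```

       With this shape, a source configured with `prefetch: 0` costs two reads, while one with a large `prefetch` (for example, a remote store where round trips are expensive) usually answers in one. The parser is identical in both cases.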



