Posted to jira@arrow.apache.org by "Andy Grove (Jira)" <ji...@apache.org> on 2020/12/28 16:18:00 UTC
[jira] [Created] (ARROW-11047) [Rust] [DataFusion] ParquetTable should avoid scanning all files twice
Andy Grove created ARROW-11047:
----------------------------------
Summary: [Rust] [DataFusion] ParquetTable should avoid scanning all files twice
Key: ARROW-11047
URL: https://issues.apache.org/jira/browse/ARROW-11047
Project: Apache Arrow
Issue Type: Improvement
Components: Rust - DataFusion
Reporter: Andy Grove
ParquetTable currently reads the metadata for all files once in the constructor in order to get the schema, and then reads it again each time scan() is called.
We could read the metadata once in the constructor and cache it, so that scan() reuses the cached copy instead.
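
A minimal sketch of the caching idea, using only the Rust standard library. Every name here (FileMetadata, MetadataReader, CachedParquetTable, demo_read_count) is hypothetical and made up for illustration; this is not the DataFusion or parquet crate API. It assumes the schema can be taken from the first file's metadata, and it uses a read counter in place of real file I/O to show that scan() does not touch the files again.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for the footer metadata of one Parquet file.
#[derive(Clone, Debug, PartialEq)]
pub struct FileMetadata {
    pub path: String,
    pub columns: Vec<String>,
    pub num_rows: usize,
}

/// Hypothetical metadata source that counts how often a footer is read.
pub struct MetadataReader {
    files: Vec<FileMetadata>,
    reads: Cell<usize>,
}

impl MetadataReader {
    pub fn new(files: Vec<FileMetadata>) -> Self {
        Self { files, reads: Cell::new(0) }
    }

    /// Simulates reading one file's metadata; each call counts as one read.
    pub fn read_metadata(&self, path: &str) -> Result<FileMetadata, String> {
        self.reads.set(self.reads.get() + 1);
        self.files
            .iter()
            .find(|m| m.path == path)
            .cloned()
            .ok_or_else(|| format!("no such file: {}", path))
    }

    pub fn reads(&self) -> usize {
        self.reads.get()
    }
}

/// Hypothetical table that reads metadata once and keeps it.
pub struct CachedParquetTable {
    schema: Vec<String>,
    metadata: Vec<FileMetadata>,
}

impl CachedParquetTable {
    /// Reads every file's metadata exactly once and caches the result.
    pub fn try_new(reader: &MetadataReader, paths: &[&str]) -> Result<Self, String> {
        let mut metadata = Vec::with_capacity(paths.len());
        for &path in paths {
            metadata.push(reader.read_metadata(path)?);
        }
        // Assumption: the first file's columns define the table schema.
        let schema = metadata
            .first()
            .map(|m| m.columns.clone())
            .ok_or_else(|| "no files given".to_string())?;
        Ok(Self { schema, metadata })
    }

    pub fn schema(&self) -> &[String] {
        &self.schema
    }

    /// Returns the cached per-file metadata; performs no further reads.
    pub fn scan(&self) -> &[FileMetadata] {
        &self.metadata
    }
}

/// Builds a two-file table, scans it twice, and returns the read count.
pub fn demo_read_count() -> usize {
    let cols = vec!["id".to_string(), "name".to_string()];
    let reader = MetadataReader::new(vec![
        FileMetadata { path: "a.parquet".into(), columns: cols.clone(), num_rows: 10 },
        FileMetadata { path: "b.parquet".into(), columns: cols, num_rows: 5 },
    ]);
    let table = CachedParquetTable::try_new(&reader, &["a.parquet", "b.parquet"])
        .expect("both files exist in the fake reader");
    let _schema = table.schema();
    let _first = table.scan();
    let _second = table.scan();
    reader.reads()
}
```

In this sketch the number of metadata reads equals the number of files, however many times scan() is called. A real implementation would also need to decide what to do when files change on disk after the table is constructed, which the sketch does not address.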
--
This message was sent by Atlassian Jira
(v8.3.4#803005)