Posted to jira@arrow.apache.org by "Andy Grove (Jira)" <ji...@apache.org> on 2020/12/28 16:18:00 UTC

[jira] [Created] (ARROW-11047) [Rust] [DataFusion] ParquetTable should avoid scanning all files twice

Andy Grove created ARROW-11047:
----------------------------------

             Summary: [Rust] [DataFusion] ParquetTable should avoid scanning all files twice
                 Key: ARROW-11047
                 URL: https://issues.apache.org/jira/browse/ARROW-11047
             Project: Apache Arrow
          Issue Type: Improvement
          Components: Rust - DataFusion
            Reporter: Andy Grove


ParquetTable currently reads the metadata for all files once in the constructor in order to get the schema, and does it again each time scan() is called.

We could read the metadata once, in the constructor, and cache it so that scan() reuses it instead.
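
A minimal sketch of the caching idea, not the actual DataFusion code. Everything here is hypothetical and for illustration only: FileMetadata stands in for the real Parquet footer metadata, MetadataReader stands in for whatever opens the files (it counts reads so the saving is visible), and CachedParquetTable is an assumed name, not the existing ParquetTable type. The schema is simply taken from the first file.

```rust
use std::cell::Cell;
use std::sync::Arc;

/// Hypothetical stand-in for the metadata read from one Parquet file footer.
#[derive(Debug, Clone, PartialEq)]
pub struct FileMetadata {
    pub path: String,
    pub columns: Vec<String>,
}

/// Hypothetical metadata reader. It fabricates a fixed two-column schema
/// and counts how many times a file's metadata was requested.
pub struct MetadataReader {
    reads: Cell<usize>,
}

impl MetadataReader {
    pub fn new() -> Self {
        Self { reads: Cell::new(0) }
    }

    /// Pretend to open `path` and read its metadata.
    pub fn read(&self, path: &str) -> FileMetadata {
        self.reads.set(self.reads.get() + 1);
        FileMetadata {
            path: path.to_string(),
            columns: vec!["id".to_string(), "value".to_string()],
        }
    }

    /// Number of metadata reads performed so far.
    pub fn reads(&self) -> usize {
        self.reads.get()
    }
}

/// Hypothetical table that reads metadata once and keeps it.
pub struct CachedParquetTable {
    schema: Vec<String>,
    metadata: Arc<Vec<FileMetadata>>,
}

impl CachedParquetTable {
    /// Read the metadata of every file exactly once and cache it.
    pub fn try_new(reader: &MetadataReader, paths: &[&str]) -> Result<Self, String> {
        let metadata: Vec<FileMetadata> = paths.iter().map(|p| reader.read(p)).collect();
        // Assumption: the first file's columns define the table schema.
        let schema = metadata
            .first()
            .map(|m| m.columns.clone())
            .ok_or_else(|| "no files".to_string())?;
        Ok(Self {
            schema,
            metadata: Arc::new(metadata),
        })
    }

    pub fn schema(&self) -> &[String] {
        &self.schema
    }

    /// Hand out the cached metadata; no file is touched again.
    pub fn scan(&self) -> Arc<Vec<FileMetadata>> {
        Arc::clone(&self.metadata)
    }
}
```

With this shape, constructing the table over two files costs two metadata reads, and any number of later scan() calls cost none, since each call only clones the Arc. A real implementation would also need to decide what to do if the files change on disk after the cache is filled.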



--
This message was sent by Atlassian Jira
(v8.3.4#803005)