Posted to jira@arrow.apache.org by "Jonathan Keane (Jira)" <ji...@apache.org> on 2021/09/14 19:14:00 UTC

[jira] [Created] (ARROW-13998) [C++] Add page skipping to parquet reading

Jonathan Keane created ARROW-13998:
--------------------------------------

             Summary: [C++] Add page skipping to parquet reading
                 Key: ARROW-13998
                 URL: https://issues.apache.org/jira/browse/ARROW-13998
             Project: Apache Arrow
          Issue Type: Improvement
          Components: C++
            Reporter: Jonathan Keane


bq. We don’t do data page skipping at all in parquet-cpp. We should add this to the short list of holistic improvements to the datasets infrastructure. We support row group skipping using column chunk statistics, but that is very coarse-grained. Data pages are much more fine-grained:
bq. https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift#L516
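
To illustrate the idea, here is a minimal sketch of page-level pruning by min/max statistics. It is not parquet-cpp code: PageStats, PagesToRead, and RowsSkipped are hypothetical names, and the sketch assumes that each data page carries min/max statistics for an int64 column and that the filter is a closed range [lo, hi]. A real implementation would have to read those statistics from the Parquet metadata and handle pages where they are missing.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-page summary for one int64 column (not a parquet-cpp type).
struct PageStats {
  int64_t min_value;  // smallest value stored in the page
  int64_t max_value;  // largest value stored in the page
  int64_t num_rows;   // number of rows covered by the page
};

// Returns the indices of pages whose [min_value, max_value] range overlaps
// the closed filter range [lo, hi]. All other pages cannot contain a match
// and could be skipped without being decompressed or decoded.
std::vector<std::size_t> PagesToRead(const std::vector<PageStats>& pages,
                                     int64_t lo, int64_t hi) {
  std::vector<std::size_t> keep;
  for (std::size_t i = 0; i < pages.size(); ++i) {
    const PageStats& p = pages[i];
    if (p.max_value >= lo && p.min_value <= hi) {
      keep.push_back(i);
    }
  }
  return keep;
}

// Counts the rows that fall in skipped pages, given the kept page indices.
int64_t RowsSkipped(const std::vector<PageStats>& pages,
                    const std::vector<std::size_t>& keep) {
  int64_t total = 0;
  for (const PageStats& p : pages) total += p.num_rows;
  int64_t kept = 0;
  for (std::size_t i : keep) kept += pages[i].num_rows;
  return total - kept;
}
```

The same overlap test applied to column chunk statistics gives row group skipping. Applying it per page only narrows the unit of work, which is why it can discard data that a row-group-level check has to keep.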

--
This message was sent by Atlassian Jira
(v8.3.4#803005)