Posted to issues@arrow.apache.org by "Alec Mocatta (JIRA)" <ji...@apache.org> on 2019/01/21 22:00:02 UTC

[jira] [Created] (ARROW-4314) Strongly-typed reading of Parquet data

Alec Mocatta created ARROW-4314:
-----------------------------------

             Summary: Strongly-typed reading of Parquet data
                 Key: ARROW-4314
                 URL: https://issues.apache.org/jira/browse/ARROW-4314
             Project: Apache Arrow
          Issue Type: New Feature
          Components: Rust
            Reporter: Alec Mocatta


See the proposal I made on [~csun]'s repository [here|https://github.com/sunchao/parquet-rs/issues/205] for more details.

This aims to let the user opt in to strong typing and substantial performance improvements (2x-7x, see [here|https://github.com/sunchao/parquet-rs/issues/205#issuecomment-446016254]) by optionally specifying the type of the records that they are iterating over.
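As a rough illustration of what "optionally specifying the type of the records" could look like, here is a minimal, self-contained sketch. It is not the API from the linked branch: `Value`, `Row`, `Record`, `User`, `typed_rows` and `sample_rows` are all hypothetical names invented for this example. It also builds typed records from untyped rows, so it shows only the typing side of the idea and says nothing about the performance side.

```rust
use std::collections::HashMap;

/// Hypothetical dynamically typed field value, standing in for an untyped row API.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    Str(String),
}

/// Hypothetical untyped row: field names mapped to tagged values.
type Row = HashMap<String, Value>;

/// Hypothetical trait for types that can be built from an untyped row.
trait Record: Sized {
    fn from_row(row: &Row) -> Result<Self, String>;
}

/// The record type the user opts in to.
#[derive(Debug, PartialEq)]
struct User {
    id: i64,
    name: String,
}

impl Record for User {
    fn from_row(row: &Row) -> Result<Self, String> {
        // Each field is checked once here, so callers never match on `Value`.
        let id = match row.get("id") {
            Some(Value::Int(v)) => *v,
            other => return Err(format!("field `id`: expected Int, got {:?}", other)),
        };
        let name = match row.get("name") {
            Some(Value::Str(s)) => s.clone(),
            other => return Err(format!("field `name`: expected Str, got {:?}", other)),
        };
        Ok(User { id, name })
    }
}

/// Iterate over rows as `T`; the caller opts in by naming the type.
fn typed_rows<'a, T: Record + 'a>(
    rows: &'a [Row],
) -> impl Iterator<Item = Result<T, String>> + 'a {
    rows.iter().map(T::from_row)
}

/// Two well-formed rows used for demonstration.
fn sample_rows() -> Vec<Row> {
    let mut a = Row::new();
    a.insert("id".to_string(), Value::Int(1));
    a.insert("name".to_string(), Value::Str("ada".to_string()));
    let mut b = Row::new();
    b.insert("id".to_string(), Value::Int(2));
    b.insert("name".to_string(), Value::Str("grace".to_string()));
    vec![a, b]
}
```

With this shape, `typed_rows::<User>(&rows)` yields `Result<User, String>` items, so a schema mismatch surfaces as an error at the point of reading rather than as a failed match deep in user code, while callers who do not name a type can keep working with `Row` directly.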

It is currently a work in progress. All pre-existing tests succeed, bar those in src/record/api.rs which are commented out as they require reworking. Where relevant, pre-existing tests and benchmarks have been duplicated to make new strongly-typed tests and benchmarks, which all also succeed. I've tried to maintain pre-existing APIs where possible. Some changes have been made to better align with prior art in the Rust ecosystem.

Any feedback while I continue working on it is very welcome! I look forward to seeing this merged when it's ready.

[https://github.com/alecmocatta/arrow]


