Posted to jira@arrow.apache.org by "Andy Grove (Jira)" <ji...@apache.org> on 2020/12/24 19:07:00 UTC

[jira] [Assigned] (ARROW-10620) [Rust][Parquet] move column chunk range logic to metadata.rs

     [ https://issues.apache.org/jira/browse/ARROW-10620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andy Grove reassigned ARROW-10620:
----------------------------------

    Assignee: Remi Dettai

> [Rust][Parquet] move column chunk range logic to metadata.rs
> ------------------------------------------------------------
>
>                 Key: ARROW-10620
>                 URL: https://issues.apache.org/jira/browse/ARROW-10620
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Rust
>            Reporter: Remi Dettai
>            Assignee: Remi Dettai
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 3.0.0
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> Getting the byte range of a column chunk inside a Parquet file can be useful for external crates (for instance, if they want to pre-fetch the columns), but it is not completely obvious how to compute it (a look at [1] and [2] is enough to see that things can quickly get messy).
> I think it would be nice to move this logic into the metadata definition rather than leave it buried in the middle of the reader implementation.
> [1] https://stackoverflow.com/questions/55225108/why-is-dictionary-page-offset-0-for-plain-dictionary-encoding/
> [2] https://issues.apache.org/jira/browse/PARQUET-816
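
As a rough illustration of the logic the description refers to, here is a minimal sketch of how a column chunk's byte range might be derived from its metadata. The struct `ColumnChunkMeta` and its method `byte_range` are hypothetical stand-ins, not the crate's actual API. The sketch assumes the chunk starts at the dictionary page when one is recorded and at the first data page otherwise, and that its length is the total compressed size. Treating a dictionary offset of 0 as "absent" is an assumption motivated by the confusion described in [1].

```rust
/// Hypothetical stand-in for the column chunk metadata fields involved.
/// Field names mirror the Parquet format, but this is not the crate's type.
#[derive(Debug, Clone)]
pub struct ColumnChunkMeta {
    pub data_page_offset: i64,
    pub dictionary_page_offset: Option<i64>,
    pub total_compressed_size: i64,
}

impl ColumnChunkMeta {
    /// Returns `(start, length)` in bytes for this column chunk,
    /// or `None` if the metadata holds negative values.
    pub fn byte_range(&self) -> Option<(u64, u64)> {
        // Assumption: a dictionary page, when present, precedes the data pages.
        // Assumption: an offset of 0 means "not set", since a page cannot
        // start at the very beginning of the file (see [1]).
        let start = match self.dictionary_page_offset {
            Some(offset) if offset > 0 => offset,
            _ => self.data_page_offset,
        };
        let len = self.total_compressed_size;
        if start < 0 || len < 0 {
            return None;
        }
        Some((start as u64, len as u64))
    }
}
```

With a helper like this on the metadata side, an external crate that wants to pre-fetch columns could call `byte_range()` for each chunk and issue one ranged read per result, without depending on the reader implementation.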



--
This message was sent by Atlassian Jira
(v8.3.4#803005)