Posted to jira@arrow.apache.org by "Jayjeet Chakraborty (Jira)" <ji...@apache.org> on 2021/06/16 19:56:00 UTC

[jira] [Commented] (ARROW-12921) [C++][Dataset] Add RadosParquetFileFormat to Dataset API

    [ https://issues.apache.org/jira/browse/ARROW-12921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17364502#comment-17364502 ] 

Jayjeet Chakraborty commented on ARROW-12921:
---------------------------------------------

Hello all, I wanted to follow up on this issue and ask whether anyone has had a chance to look at the PR. It would be great if a reviewer could be assigned to the pull request; we would be glad to work with them on it.

> [C++][Dataset] Add RadosParquetFileFormat to Dataset API
> --------------------------------------------------------
>
>                 Key: ARROW-12921
>                 URL: https://issues.apache.org/jira/browse/ARROW-12921
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: C++, Continuous Integration, Documentation, Python
>            Reporter: Jayjeet Chakraborty
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Implement a RadosParquetFileFormat class that defers the evaluation of scan operations on a Parquet dataset to a RADOS storage backend. This can be done by using the librados C++ library to execute storage-side functions that scan the files on the Ceph storage nodes (OSDs) using the Arrow libraries. This issue builds on the earlier story, ARROW-10549. See the corresponding [mailing list|https://lists.apache.org/thread.html/r2a5a693967213b7c6bb49015194ca16afc4d20047805d0e069c2e45c%40%3Cdev.arrow.apache.org%3E] discussion.


