Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/04/11 22:20:20 UTC

[GitHub] [arrow] westonpace commented on issue #12851: does arrow support parquet column index feature?

westonpace commented on issue #12851:
URL: https://github.com/apache/arrow/issues/12851#issuecomment-1095645729

   Regarding the C++ implementation (and, by extension, the Python, R, and Ruby extensions): parquet-C++, the Parquet library that is part of (and used by) arrow-c++, does have some support for serializing and deserializing the column index structures.
   
   However, Arrow's readers and writers for Parquet do not (to the best of my knowledge) use these indices for filter pushdown, and they do not support writing them.
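
To illustrate what using these indices for filter pushdown would mean, here is a minimal sketch in plain Python. It does not call any Arrow or Parquet API: `PageStats` and `pages_to_read` are hypothetical names, and the min/max values are made up. The only thing assumed from the format is that the column index records a min and max value per data page, so a reader can skip pages whose range cannot match the filter.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PageStats:
    """Hypothetical stand-in for one column index entry (per-page min/max)."""
    min_value: int
    max_value: int


def pages_to_read(index: List[PageStats], lo: int, hi: int) -> List[int]:
    """Return the ordinals of pages whose [min, max] range overlaps [lo, hi]."""
    return [
        i for i, page in enumerate(index)
        if page.max_value >= lo and page.min_value <= hi
    ]


# Illustrative values only: three pages of a column with increasing values.
column_index = [PageStats(0, 99), PageStats(100, 199), PageStats(200, 299)]

# A filter such as `150 <= x <= 160` only needs the second page.
selected = pages_to_read(column_index, 150, 160)
print(selected)
```

A real reader would then use the offset index to locate and decode only the selected pages, which is the part that is not wired into Arrow's readers today.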
   
   Arrow is an open source project and so "any plan to support it" usually boils down to whether there is someone motivated enough with enough time to tackle the feature.  It is something I think would be a great addition.
   
   Adding the feature to the C++ implementation is tracked in [PARQUET-1404](https://issues.apache.org/jira/browse/PARQUET-1404) and [ARROW-10158](https://issues.apache.org/jira/browse/ARROW-10158).  Those JIRA tickets reference an earlier attempt to implement this but, unfortunately, it appears that work may have been abandoned.  There is a related mailing list discussion [here](https://lists.apache.org/thread/8skd7p6c1ohx16smptdoz462oc2fllb4).
   
   Adding the feature to the Rust implementation is tracked here: https://github.com/apache/arrow-datafusion/issues/847

