Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/04/11 02:37:17 UTC

[GitHub] [arrow] sandflee opened a new issue, #12851: does arrow support parquet column index feature?

sandflee opened a new issue, #12851:
URL: https://github.com/apache/arrow/issues/12851

   Parquet has a ColumnIndex to support page skipping (https://github.com/apache/parquet-format/blob/master/PageIndex.md). Does Arrow support it, and if not, is there any plan to support it?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] sandflee commented on issue #12851: does arrow support parquet column index feature?

Posted by GitBox <gi...@apache.org>.
sandflee commented on issue #12851:
URL: https://github.com/apache/arrow/issues/12851#issuecomment-1097484828

   We're doing a technical investigation of Parquet and were surprised to see that Arrow has support for it. Thanks for your detailed and useful reply.




[GitHub] [arrow] sandflee closed issue #12851: does arrow support parquet column index feature?

Posted by GitBox <gi...@apache.org>.
sandflee closed issue #12851: does arrow support parquet column index feature?
URL: https://github.com/apache/arrow/issues/12851




[GitHub] [arrow] westonpace commented on issue #12851: does arrow support parquet column index feature?

Posted by GitBox <gi...@apache.org>.
westonpace commented on issue #12851:
URL: https://github.com/apache/arrow/issues/12851#issuecomment-1095645729

   Regarding the C++ implementation (and by extension the Python, R, and Ruby bindings): parquet-C++, the Parquet library that is part of (and used by) arrow-c++, does have some support for serializing and deserializing these structures.
   
   However, Arrow's readers and writers for Parquet do not (to the best of my knowledge) support using these indices for filter pushdown, nor do they support writing indices.
   
   Arrow is an open source project and so "any plan to support it" usually boils down to whether there is someone motivated enough with enough time to tackle the feature.  It is something I think would be a great addition.
   
   Adding the feature to the C++ implementation is tracked in [PARQUET-1404](https://issues.apache.org/jira/browse/PARQUET-1404) and [ARROW-10158](https://issues.apache.org/jira/browse/ARROW-10158). Those JIRA tickets reference an attempt to implement this but, unfortunately, it appears that work may have been abandoned. There is a related mailing list discussion [here](https://lists.apache.org/thread/8skd7p6c1ohx16smptdoz462oc2fllb4).
   
   Adding the feature to the Rust implementation is tracked here: https://github.com/apache/arrow-datafusion/issues/847
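
For readers unfamiliar with the idea, here is a minimal sketch of how per-page min/max statistics (the kind of information a ColumnIndex carries) could let a reader skip pages for a range filter. It is plain Python and does not use any Arrow or Parquet API; `PageStats` and `pages_to_read` are hypothetical names made up for illustration, and real indexes also have to handle null pages, sort order, and page offsets.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PageStats:
    """Hypothetical per-page entry holding the min and max value of one data page."""
    min_value: int
    max_value: int


def pages_to_read(column_index: List[PageStats], lower: int, upper: int) -> List[int]:
    """Return the indices of pages whose [min, max] range overlaps [lower, upper].

    Pages that cannot contain a matching value are skipped entirely.
    """
    return [
        i
        for i, page in enumerate(column_index)
        if page.max_value >= lower and page.min_value <= upper
    ]


# Three pages covering disjoint value ranges (illustrative data only).
column_index = [PageStats(0, 99), PageStats(100, 199), PageStats(200, 299)]

# A filter such as 120 <= x <= 180 only needs the second page.
selected = pages_to_read(column_index, 120, 180)
print(selected)
```

With sorted or clustered data most pages fall outside the filter range, which is where skipping them would save I/O and decoding work.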

