Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/05/14 21:32:15 UTC
[GitHub] [arrow-rs] ParadoxShmaradox commented on issue #1191: Parquet Scan Filter
ParadoxShmaradox commented on issue #1191:
URL: https://github.com/apache/arrow-rs/issues/1191#issuecomment-1126814290
Just wanted to chip in and say that implementing PageIndex would be great even if parquet-rs doesn't use it internally for predicate pushdown, as it can be used by other engines/implementations that do. In analytics, these files get passed around between different systems.
I'm currently rewriting a Java project in Rust. The project uses parquet 1.11.1 (so no Hadoop, Impala, or Spark) and does predicate pushdown using ColumnIndex and OffsetIndex.
I'm using parquet-rs to write the file and DataFusion to read it back. Unfortunately, the current system can't read the file due to the absence of the PageIndex.
Having existing systems be able to read parquet-rs files would be a boon for backward/forward compatibility.
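
For anyone unfamiliar with why readers depend on the page index, here is a minimal sketch of the idea in plain Rust. It is not the parquet-rs API: `PageStats`, `PageLocation`, and `pages_to_read` are hypothetical, simplified stand-ins for ColumnIndex entries (per-page min/max) and OffsetIndex entries (per-page file location), and the values are assumed to be plain `i64`. The real structures carry more fields (null pages, null counts, boundary order).

```rust
/// Hypothetical, simplified stand-in for one ColumnIndex entry:
/// the min and max value stored in a single data page.
#[derive(Debug, Clone, Copy)]
struct PageStats {
    min: i64,
    max: i64,
}

/// Hypothetical, simplified stand-in for one OffsetIndex entry:
/// where the page lives in the file and which row it starts at.
#[derive(Debug, Clone, Copy, PartialEq)]
struct PageLocation {
    offset: i64,
    compressed_page_size: i32,
    first_row_index: i64,
}

/// Return the locations of pages whose [min, max] range overlaps [lo, hi].
/// Pages that cannot contain a matching value are skipped without being read.
fn pages_to_read(
    column_index: &[PageStats],
    offset_index: &[PageLocation],
    lo: i64,
    hi: i64,
) -> Vec<PageLocation> {
    column_index
        .iter()
        .zip(offset_index.iter())
        .filter(|(stats, _)| stats.max >= lo && stats.min <= hi)
        .map(|(_, loc)| *loc)
        .collect()
}

fn main() {
    // Made-up index contents for a column chunk with three pages.
    let column_index = [
        PageStats { min: 0, max: 99 },
        PageStats { min: 100, max: 199 },
        PageStats { min: 200, max: 299 },
    ];
    let offset_index = [
        PageLocation { offset: 4, compressed_page_size: 512, first_row_index: 0 },
        PageLocation { offset: 516, compressed_page_size: 480, first_row_index: 1000 },
        PageLocation { offset: 996, compressed_page_size: 530, first_row_index: 2000 },
    ];

    // Predicate: 150 <= value <= 160. Only the second page can match.
    for loc in pages_to_read(&column_index, &offset_index, 150, 160) {
        println!(
            "read page at offset {} ({} bytes), first row {}",
            loc.offset, loc.compressed_page_size, loc.first_row_index
        );
    }
}
```

A reader that relies on this kind of pruning has nothing to filter on when the writer omits the index, which is the interoperability gap described above.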