Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/05/14 21:32:15 UTC

[GitHub] [arrow-rs] ParadoxShmaradox commented on issue #1191: Parquet Scan Filter

ParadoxShmaradox commented on issue #1191:
URL: https://github.com/apache/arrow-rs/issues/1191#issuecomment-1126814290

   Just wanted to chip in and say that implementing PageIndex would be great even if parquet-rs doesn't use it internally for predicate pushdown, since it can be used by other engines and implementations that do. In analytics work, these files get passed around between different systems.
   
   I'm currently rewriting a project from Java to Rust. It uses parquet 1.11.1 (so no Hadoop, Impala, or Spark) and relies on predicate pushdown using ColumnIndex and OffsetIndex.
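
   For anyone unfamiliar with how these two indexes are used together, here is a rough sketch of page-level pruning. The `PageStats` and `PageLocation` structs and the `pages_to_read` function are hypothetical, simplified stand-ins that I made up for illustration; they are not the parquet-rs or Java parquet types, and real indexes also carry null counts and typed min/max values.

```rust
// Hypothetical, simplified stand-ins for one column chunk's ColumnIndex and
// OffsetIndex entries. These are NOT types from parquet-rs or Java parquet.
#[derive(Debug, Clone, Copy, PartialEq)]
struct PageStats {
    min: i64, // smallest value stored in the page
    max: i64, // largest value stored in the page
}

#[derive(Debug, Clone, Copy, PartialEq)]
struct PageLocation {
    offset: u64,          // byte offset of the page in the file
    compressed_size: u32, // size of the page in bytes
    first_row_index: u64, // index of the first row in the page
}

/// Return the locations of pages that may contain a value in `lo..=hi`.
/// Pages whose [min, max] range cannot overlap the predicate are skipped,
/// so a reader never has to fetch or decode them.
fn pages_to_read(
    column_index: &[PageStats],
    offset_index: &[PageLocation],
    lo: i64,
    hi: i64,
) -> Vec<PageLocation> {
    column_index
        .iter()
        .zip(offset_index.iter())
        .filter(|(stats, _)| stats.max >= lo && stats.min <= hi)
        .map(|(_, loc)| *loc)
        .collect()
}

fn main() {
    // Three pages with made-up statistics and locations.
    let column_index = [
        PageStats { min: 0, max: 99 },
        PageStats { min: 100, max: 199 },
        PageStats { min: 200, max: 299 },
    ];
    let offset_index = [
        PageLocation { offset: 4, compressed_size: 800, first_row_index: 0 },
        PageLocation { offset: 804, compressed_size: 800, first_row_index: 100 },
        PageLocation { offset: 1604, compressed_size: 800, first_row_index: 200 },
    ];

    // Predicate: value BETWEEN 150 AND 160.
    for loc in pages_to_read(&column_index, &offset_index, 150, 160) {
        println!(
            "read {} bytes at offset {} (first row {})",
            loc.compressed_size, loc.offset, loc.first_row_index
        );
    }
}
```

   The point is that a reader doing this kind of pruning needs both indexes present in the file: the min/max statistics to decide which pages matter, and the page locations to seek to them directly.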
   
   I'm using parquet-rs to write the file and datafusion to read it back. Unfortunately, the current system can't read the file due to the absence of the PageIndex.
   
   Having current systems be able to read parquet-rs files would be a boon for backward/forward compatibility.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org