Posted to github@arrow.apache.org by "alamb (via GitHub)" <gi...@apache.org> on 2023/04/24 17:20:45 UTC

[GitHub] [arrow-datafusion] alamb commented on issue #4177: Automatically detect and use "is the data sorted" information in parquet file metadata

alamb commented on issue #4177:
URL: https://github.com/apache/arrow-datafusion/issues/4177#issuecomment-1520553333

   Some of the Parquet file metadata is already read as part of physical planning (e.g. to fetch the statistics). I don't quite remember how it is all hooked up, but you can trace it back from:
   
   https://github.com/apache/arrow-datafusion/blob/729586258fe6371e394b8b2caa4e1b55eccbf6c5/datafusion/core/src/physical_plan/file_format/parquet.rs#L154
   
   That might give a sense of how we could use the sortedness information in DataFusion without doing more work.
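
   To make the idea concrete, here is a minimal, self-contained sketch of one way the per-row-group sort declarations could be combined once the metadata is in hand. It is not DataFusion code: the `SortingColumn` struct below is a hypothetical stand-in that mirrors the three fields of the Parquet Thrift `SortingColumn`, and `common_sort_prefix` is an illustrative helper name. It only finds the sort keys that every row group declares; since Parquet declares sortedness per row group, ordering across row groups would still have to be established separately (for example from min/max statistics).

```rust
/// Hypothetical stand-in for the sort declaration stored per row group in
/// Parquet metadata (mirrors the fields of the Thrift `SortingColumn`).
#[derive(Debug, Clone, PartialEq, Eq)]
struct SortingColumn {
    column_idx: i32,
    descending: bool,
    nulls_first: bool,
}

/// Illustrative helper: returns the longest sort-key prefix that every row
/// group declares. A `None` entry models a row group with no sortedness
/// information, in which case nothing can be assumed for the file.
fn common_sort_prefix(row_groups: &[Option<Vec<SortingColumn>>]) -> Vec<SortingColumn> {
    let mut iter = row_groups.iter();
    // Start from the first row group's declaration, if it has one.
    let mut prefix = match iter.next() {
        Some(Some(cols)) => cols.clone(),
        _ => return Vec::new(),
    };
    for rg in iter {
        match rg {
            Some(cols) => {
                // Keep only the leading keys that match exactly.
                let shared = prefix
                    .iter()
                    .zip(cols.iter())
                    .take_while(|(a, b)| a == b)
                    .count();
                prefix.truncate(shared);
            }
            None => return Vec::new(),
        }
    }
    prefix
}

fn main() {
    let key = |idx: i32| SortingColumn {
        column_idx: idx,
        descending: false,
        nulls_first: false,
    };
    // Two row groups: one sorted by columns (0, 1), the other by (0, 2).
    let row_groups = vec![Some(vec![key(0), key(1)]), Some(vec![key(0), key(2)])];
    let prefix = common_sort_prefix(&row_groups);
    println!("common sort prefix: {:?}", prefix);
}
```

   In this sketch a planner would treat the returned prefix as a candidate ordering only, and fall back to "unsorted" whenever any row group lacks the declaration.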

