Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/12/22 11:24:26 UTC

[GitHub] [arrow] jorisvandenbossche edited a comment on pull request #6303: ARROW-8039: [Python] Use dataset API in existing parquet readers and tests

jorisvandenbossche edited a comment on pull request #6303:
URL: https://github.com/apache/arrow/pull/6303#issuecomment-621937113


   I have now listed the open TODO items from the discussions in this PR and from the skipped tests, and opened JIRA issues for those that did not have one yet:
   
   - [ ] Deduplicating the specified column names (https://github.com/apache/arrow/pull/6303#discussion_r397350410): do we actually want this? 
   - [x] Support buffers/NativeFile as file source -> https://issues.apache.org/jira/browse/ARROW-8074
   - [ ] Multithreaded discovery -> https://issues.apache.org/jira/browse/ARROW-8137
   - [x] Deterministic row order -> https://issues.apache.org/jira/browse/ARROW-8447
   - [ ] Error on "bad" files instead of skipping -> https://issues.apache.org/jira/browse/ARROW-7673
   - [ ] Partition fields using dictionary type -> opened https://issues.apache.org/jira/browse/ARROW-8647 (but should this be on by default?)
   - [ ] Metadata support: https://issues.apache.org/jira/browse/ARROW-8062 covers discovery from metadata files, but we also need a way to expose the metadata/statistics
   - [ ] Support pickling -> opened https://issues.apache.org/jira/browse/ARROW-8651
   - [ ] Better error message when encountering invalid files (raised in a review comment): this seems to work now; opened https://issues.apache.org/jira/browse/ARROW-8652 to enable the test
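
For the first item, a minimal sketch of what order-preserving deduplication of the requested column names could look like. `dedupe_columns` is a hypothetical helper for illustration only; it is not part of pyarrow or of this PR, and whether the readers should do this at all is still the open question.

```python
def dedupe_columns(columns):
    """Drop repeated column names, keeping the first occurrence of each.

    Hypothetical helper (an assumption for illustration, not pyarrow API).
    Relies on dicts preserving insertion order (Python 3.7+).
    """
    if columns is None:
        # None conventionally means "read all columns"; pass it through.
        return None
    return list(dict.fromkeys(columns))


requested = ["a", "b", "a", "c", "b"]
print(dedupe_columns(requested))
```

The alternative is to pass the list through unchanged and let the reader either return the column twice or raise, which is the behaviour the checklist item asks us to decide on.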
   

