Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/12/16 18:45:54 UTC

[GitHub] [arrow-datafusion] brianrackle opened a new issue #1458: Registering IPC Source into ExecutionContext or Loading Into DataFrame

brianrackle opened a new issue #1458:
URL: https://github.com/apache/arrow-datafusion/issues/1458


   **Is your feature request related to a problem or challenge? Please describe what you are trying to do.**
   It would be convenient to have a straightforward way to register an IPC-formatted file and create a DataFrame from it. There are convenience functions for doing this with all other file types, but IPC doesn't seem to have any such support. Unless I am missing something, IPC data can be loaded from a file into a set of RecordBatches, but there does not seem to be a straightforward way to load those into an ExecutionContext.
   
   **Describe the solution you'd like**
   Convenience functions for querying IPC data, similar to the existing register/read functions for CSV, Parquet, and Avro.
   
   **Describe alternatives you've considered**
   Using the object store or listing tables to get IPC files into the execution context, but I am still trying to figure out how to do this.
   
   **Additional context**
   I am still new, so maybe I am missing something obvious. An example in the examples dir might be sufficient to show how to work with IPC data sources.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] brianrackle commented on issue #1458: Registering IPC Source into ExecutionContext or Loading Into DataFrame

Posted by GitBox <gi...@apache.org>.
brianrackle commented on issue #1458:
URL: https://github.com/apache/arrow-datafusion/issues/1458#issuecomment-1002811625


   I am doing this through MemTable for now, but I would like to figure out how to stream from the file, since, as I understand it, streaming is supposed to be one benefit of Arrow files over Parquet.
   
   Using MemTable gist:
   https://gist.github.com/brianrackle/f1e6e74615759ae906a55b89dc91e59f

