Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/04/27 19:08:23 UTC

[GitHub] [arrow-datafusion] roeap commented on issue #1313: [Question] Usage of async object store APIs in consuming code

roeap commented on issue #1313:
URL: https://github.com/apache/arrow-datafusion/issues/1313#issuecomment-1111378407

   @houqp @yjshen - it's been a while, and a lot has happened since then :). Specifically, we now have async support in `arrow-rs` as well, and I was hoping to help implement async support for the ObjectReader on this end too. Looking into what that might entail, I stumbled across a few questions, and I was hoping you could provide some guidance on whether I understood things correctly.
   
   The trait in here uses `AsyncRead` from futures, while the parquet implementation uses the tokio traits. Seeing that tokio is already a non-optional dependency in data-access, would it be OK to change that API?
   
   In contrast to the sync API, a method for a full reader does not exist (yet?) for the async case. My current thinking is to switch to the tokio traits, to be consistent with the parquet reader (an alternative would be to use one of the compat methods), and add a reader method somewhat like this:
   
   ```rust
   async fn reader(&self) -> Result<Box<dyn AsyncRead + AsyncSeek + Unpin>>;
   ```
   
   and then try applying `ParquetRecordBatchStream` to simplify `ParquetExec` (maybe in a second PR).
   
   Is this on the right track?
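
   One detail to check in the signature above: a Rust trait object can name only one non-auto trait, so `dyn AsyncRead + AsyncSeek + Unpin` would probably need a small combining supertrait. Below is a minimal sketch of that pattern. It uses the std sync traits `Read` and `Seek` as stand-ins for the tokio traits so that it compiles without extra crates, and `ReadSeek`, `InMemoryObject`, and `read_tail` are hypothetical names made up for illustration, not part of data-access.

   ```rust
   use std::io::{Cursor, Read, Result, Seek, SeekFrom};

   // Hypothetical helper trait: bundles both capabilities so they can be
   // boxed as a single trait object (`dyn Read + Seek` is rejected by rustc).
   pub trait ReadSeek: Read + Seek {}
   impl<T: Read + Seek> ReadSeek for T {}

   // Hypothetical stand-in for an object in a store, backed by memory.
   pub struct InMemoryObject {
       pub data: Vec<u8>,
   }

   impl InMemoryObject {
       // Sync analogue of the proposed `reader` method.
       pub fn reader(&self) -> Result<Box<dyn ReadSeek>> {
           Ok(Box::new(Cursor::new(self.data.clone())))
       }
   }

   // Reads the last `n` bytes, similar to how a Parquet reader first
   // seeks to the end of the file to find the footer.
   pub fn read_tail(obj: &InMemoryObject, n: i64) -> Result<Vec<u8>> {
       let mut reader = obj.reader()?;
       reader.seek(SeekFrom::End(-n))?;
       let mut buf = Vec::new();
       reader.read_to_end(&mut buf)?;
       Ok(buf)
   }
   ```

   The async version would presumably look the same, with the supertrait bounds swapped for the tokio traits plus `Unpin` and `Send` as needed.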

