Posted to github@arrow.apache.org by "kylebrooks-8451 (via GitHub)" <gi...@apache.org> on 2023/05/15 13:02:12 UTC

[GitHub] [arrow-datafusion-python] kylebrooks-8451 commented on issue #362: Use pyarrow.substrait to execute scans on Pyarrow Datasets

kylebrooks-8451 commented on issue #362:
URL: https://github.com/apache/arrow-datafusion-python/issues/362#issuecomment-1547819988

   Hi @wjones127 - Thanks for reaching out. I would vote in favor of adding Substrait expressions to the Dataset API. I think it would allow other execution engines to plug into PyArrow and would standardize converting plans between frameworks. I read through the issue you linked, and I think what we really want here is option 1 from that thread:
   
   An interface for consuming data from a dataset-like object, without having to be a pyarrow.dataset.Dataset (or Scanner) instance.
   
   If Substrait gets us there, then great.

