Posted to github@arrow.apache.org by "kylebrooks-8451 (via GitHub)" <gi...@apache.org> on 2023/05/15 13:02:12 UTC
[GitHub] [arrow-datafusion-python] kylebrooks-8451 commented on issue #362: Use pyarrow.substrait to execute scans on Pyarrow Datasets
kylebrooks-8451 commented on issue #362:
URL: https://github.com/apache/arrow-datafusion-python/issues/362#issuecomment-1547819988
Hi @wjones127, thanks for reaching out. I would vote in favor of adding Substrait expressions to the Dataset API. I think it would allow other execution engines to plug into PyArrow and would standardize converting plans between frameworks. I read through the issue you linked, and I think what we really want here is option 1 from that thread:
An interface for consuming data from a dataset-like object, without having to be a pyarrow.dataset.Dataset (or Scanner) instance.
If Substrait gets us there, then great.
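
To make "dataset-like object" a bit more concrete, here is a rough sketch of the kind of interface I have in mind. Everything in it is hypothetical: `DatasetLike`, `InMemoryDataset`, `collect`, `column_names`, and `scan` are names made up for illustration and are not part of PyArrow or DataFusion. Plain dicts of lists stand in for record batches and a Python callable stands in for a Substrait filter expression, so the example runs with only the standard library.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional, Protocol, runtime_checkable

# Stand-in for a record batch: column name -> list of values.
Batch = Dict[str, List[Any]]
# Stand-in for a filter expression: a callable evaluated per row.
Predicate = Callable[[Dict[str, Any]], bool]


@runtime_checkable
class DatasetLike(Protocol):
    """Hypothetical protocol; not an existing PyArrow or DataFusion API."""

    def column_names(self) -> List[str]: ...

    def scan(
        self,
        columns: Optional[List[str]] = None,
        predicate: Optional[Predicate] = None,
    ) -> Iterator[Batch]: ...


class InMemoryDataset:
    """Toy source that satisfies the protocol without inheriting from it."""

    def __init__(self, batches: List[Batch]) -> None:
        self._batches = batches

    def column_names(self) -> List[str]:
        return list(self._batches[0].keys()) if self._batches else []

    def scan(
        self,
        columns: Optional[List[str]] = None,
        predicate: Optional[Predicate] = None,
    ) -> Iterator[Batch]:
        names = columns if columns is not None else self.column_names()
        for batch in self._batches:
            n_rows = len(next(iter(batch.values()), []))
            # Pivot to rows so the predicate can see every column.
            rows = [{k: batch[k][i] for k in batch} for i in range(n_rows)]
            kept = [r for r in rows if predicate is None or predicate(r)]
            # Project to the requested columns after filtering.
            yield {name: [r[name] for r in kept] for name in names}


def collect(
    dataset: DatasetLike,
    columns: Optional[List[str]] = None,
    predicate: Optional[Predicate] = None,
) -> Batch:
    """Consumer that depends only on the protocol, as an engine might."""
    result: Batch = {}
    for batch in dataset.scan(columns=columns, predicate=predicate):
        for name, values in batch.items():
            result.setdefault(name, []).extend(values)
    return result


ds = InMemoryDataset([{"id": [1, 2, 3], "v": [10, 20, 30]}, {"id": [4], "v": [40]}])
filtered = collect(ds, columns=["v"], predicate=lambda row: row["id"] >= 2)
```

The point of the sketch is only that the consumer (`collect`) checks for a shape, not a concrete class, so any source that can project columns and apply a pushed-down filter could be scanned. In a real design the batches would presumably be Arrow record batches and the filter a Substrait expression.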