Posted to github@arrow.apache.org by "ianmcook (via GitHub)" <gi...@apache.org> on 2023/02/01 15:59:32 UTC
[GitHub] [arrow] ianmcook opened a new issue, #33985: [C++] Support specifying filters and projections with Substrait expressions
ianmcook opened a new issue, #33985:
URL: https://github.com/apache/arrow/issues/33985
### Describe the enhancement requested
In addition to representing full plans, Substrait can also be used to represent expressions (see https://github.com/substrait-io/substrait/pull/405/). It would be nice if Acero could consume Substrait expressions and use them to specify filters and projections.
I would love to see us expose functions that:
- Receive a Boolean-valued Substrait scalar expression and use it to add a Filter node to the ExecPlan
- Receive a list of Substrait scalar expressions and use it to add a Project node to the ExecPlan
### Component(s)
C++
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] westonpace commented on issue #33985: [C++] Support specifying filters and projections with Substrait expressions
Posted by "westonpace (via GitHub)" <gi...@apache.org>.
westonpace commented on issue #33985:
URL: https://github.com/apache/arrow/issues/33985#issuecomment-1435356988
You would need to use [this variant](https://arrow.apache.org/docs/cpp/api/dataset.html#_CPPv4N5arrow7dataset14ScannerBuilder7ProjectENSt6vectorIN7compute10ExpressionEEENSt6vectorINSt6stringEEE) of Project() but otherwise yes.
[GitHub] [arrow] ianmcook commented on issue #33985: [C++] Support specifying filters and projections with Substrait expressions
Posted by "ianmcook (via GitHub)" <gi...@apache.org>.
ianmcook commented on issue #33985:
URL: https://github.com/apache/arrow/issues/33985#issuecomment-1435319522
@westonpace with the implementation you envision, could this also give us the ability to pass Substrait expressions to [`arrow::dataset::ScannerBuilder::Filter()`](https://arrow.apache.org/docs/cpp/api/dataset.html#_CPPv4N5arrow7dataset14ScannerBuilder6FilterERKN7compute10ExpressionE) and [`arrow::dataset::ScannerBuilder::Project()`](https://arrow.apache.org/docs/cpp/api/dataset.html#_CPPv4N5arrow7dataset14ScannerBuilder7ProjectENSt6vectorINSt6stringEEE)? That would be very, very cool indeed.
[GitHub] [arrow] westonpace commented on issue #33985: [C++] Support specifying filters and projections with Substrait expressions
Posted by "westonpace (via GitHub)" <gi...@apache.org>.
westonpace commented on issue #33985:
URL: https://github.com/apache/arrow/issues/33985#issuecomment-1412409926
The Arrow equivalent of Substrait's expression is `arrow::compute::Expression`. So I think an extended expression proto file would roughly translate to `std::vector<std::pair<std::string, arrow::compute::Expression>>` (not actually suggesting we use this API, just describing).
If we added an API for that then those compute expressions could be used when building Acero filters & projects.
Note: this sort of implies the user is not using Substrait to express their actual queries. This is fine, we have non-Substrait APIs for filter & project in pyarrow (e.g. Array.filter) and R (dplyr) so there is certainly room for it.
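To make the shape described above concrete, here is a minimal, self-contained sketch of how a list of (name, expression) pairs could drive a filter and a projection. It does not use Arrow or Substrait at all: `ToyExpression`, `ToyPredicate`, `ToyTable`, `FieldRef`, `ApplyFilter` and `ApplyProject` are hypothetical stand-ins invented for illustration, with `ToyExpression` playing the role of `arrow::compute::Expression`. It is not a proposal for the actual API.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins for illustration only. None of these are Arrow types.
using Row = std::vector<double>;
// A scalar expression: maps one input row to one value.
using ToyExpression = std::function<double(const Row&)>;
// A Boolean-valued scalar expression, usable as a filter.
using ToyPredicate = std::function<bool(const Row&)>;
// The "extended expression" shape from the comment: named expressions.
using NamedExpressions = std::vector<std::pair<std::string, ToyExpression>>;

struct ToyTable {
  std::vector<std::string> names;  // column names
  std::vector<Row> rows;           // row-major values
};

// Build an expression that reads the column at `index`.
ToyExpression FieldRef(std::size_t index) {
  return [index](const Row& row) {
    assert(index < row.size());
    return row[index];
  };
}

// Filter: keep only the rows for which the predicate is true.
ToyTable ApplyFilter(const ToyTable& in, const ToyPredicate& predicate) {
  ToyTable out;
  out.names = in.names;
  for (const Row& row : in.rows) {
    if (predicate(row)) out.rows.push_back(row);
  }
  return out;
}

// Project: evaluate every named expression against every row.
// Output columns take their names from the (name, expression) pairs.
ToyTable ApplyProject(const ToyTable& in, const NamedExpressions& exprs) {
  ToyTable out;
  for (const auto& named : exprs) out.names.push_back(named.first);
  for (const Row& row : in.rows) {
    Row projected;
    for (const auto& named : exprs) projected.push_back(named.second(row));
    out.rows.push_back(std::move(projected));
  }
  return out;
}
```

In this sketch a caller would build a `NamedExpressions` list once, for example `{"y", FieldRef(1)}` plus a computed column, and pass it to `ApplyProject`, while a single `ToyPredicate` goes to `ApplyFilter`. A real implementation would deserialize the Substrait message into `arrow::compute::Expression` objects instead of using lambdas.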
[GitHub] [arrow] westonpace closed issue #33985: [C++] Support specifying filters and projections with Substrait expressions
Posted by "westonpace (via GitHub)" <gi...@apache.org>.
westonpace closed issue #33985: [C++] Support specifying filters and projections with Substrait expressions
URL: https://github.com/apache/arrow/issues/33985
[GitHub] [arrow] ianmcook commented on issue #33985: [C++] Support specifying filters and projections with Substrait expressions
Posted by "ianmcook (via GitHub)" <gi...@apache.org>.
ianmcook commented on issue #33985:
URL: https://github.com/apache/arrow/issues/33985#issuecomment-1412377054
This plus https://github.com/substrait-io/substrait-java/issues/128 would allow users to specify filters and projections as SQL expressions and execute them with Acero.