Posted to github@arrow.apache.org by "rgwood (via GitHub)" <gi...@apache.org> on 2023/05/04 22:15:57 UTC
[GitHub] [arrow-datafusion] rgwood opened a new issue, #6237: `push_down_projection` optimization fails when using variables
rgwood opened a new issue, #6237:
URL: https://github.com/apache/arrow-datafusion/issues/6237
### Describe the bug
Hello and thank you for writing+maintaining DataFusion!
When upgrading from DataFusion 18, we noticed an unexpected warning that started in #4465 by @jackwener (which shipped in DataFusion 20):
> WARN datafusion_optimizer::optimizer: Skipping optimizer rule 'push_down_projection' due to unexpected error: Error during planning: required columns can't push down
### To Reproduce
I've put together a minimal Rust repro here: https://github.com/rgwood/df-repro
In that repro, I use a [`VarProvider`](https://docs.rs/datafusion/latest/datafusion/variable/trait.VarProvider.html) to define an integer variable `@var`.
I then use that variable in the WHERE clause of a SQL query: `SELECT foo FROM csv_table WHERE bar > @var`
When evaluating the query, a warning indicates that `push_down_projection` was skipped:
> WARN datafusion_optimizer::optimizer: Skipping optimizer rule 'push_down_projection' due to unexpected error: Error during planning: required columns can't push down, columns: {Column { relation: Some(Bare { table: "csv_table" }), name: "bar" }, Column { relation: None, name: "@var" }, Column { relation: Some(Bare { table: "csv_table" }), name: "foo" }}
### Expected behavior
I expect the `push_down_projection` optimization to succeed without a warning.
### Additional context
I'm relatively new to DataFusion, but it seems like `push_down_projection` is incorrectly interpreting the variable `@var` as a column when it should not be.
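To illustrate the suspicion above, here is a minimal, std-only sketch of how treating a variable as a column could make a projection push-down check fail. It is a toy model, not DataFusion's code: the `Expr` enum, `required_columns`, `can_push_down`, and the `variables_as_columns` flag are all hypothetical names invented for this example, and the real optimizer rule may work differently.

```rust
use std::collections::BTreeSet;

/// Toy expression tree (hypothetical; NOT DataFusion's `Expr` type).
#[derive(Debug, Clone)]
enum Expr {
    /// A reference to a table column, e.g. `bar`.
    Column(String),
    /// A reference to a variable, e.g. `@var`.
    Variable(String),
    /// A `left > right` comparison.
    Gt(Box<Expr>, Box<Expr>),
}

/// Collect the names an expression needs from the underlying table scan.
/// `variables_as_columns = true` models the suspected behavior, where a
/// variable reference is recorded as if it were a column.
fn required_columns(expr: &Expr, variables_as_columns: bool, out: &mut BTreeSet<String>) {
    match expr {
        Expr::Column(name) => {
            out.insert(name.clone());
        }
        Expr::Variable(name) => {
            if variables_as_columns {
                out.insert(name.clone());
            }
        }
        Expr::Gt(left, right) => {
            required_columns(left, variables_as_columns, out);
            required_columns(right, variables_as_columns, out);
        }
    }
}

/// In this toy model, push-down is possible only if every required name
/// exists in the table schema.
fn can_push_down(required: &BTreeSet<String>, schema: &[&str]) -> bool {
    required
        .iter()
        .all(|name| schema.iter().any(|col| *col == name.as_str()))
}
```

For the predicate `bar > @var` over a table with columns `foo` and `bar`, calling `required_columns` with `variables_as_columns = true` adds `@var` to the required set, so `can_push_down` returns `false`. With the flag set to `false`, only `bar` is collected and the check passes.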
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-datafusion] rgwood commented on issue #6237: `push_down_projection` optimization fails when using variables
Posted by "rgwood (via GitHub)" <gi...@apache.org>.
rgwood commented on issue #6237:
URL: https://github.com/apache/arrow-datafusion/issues/6237#issuecomment-1536341666
> would it be possible to provide a reproducer?
Sure, I’ve put together a minimal reproduction here: https://github.com/rgwood/df-repro
[GitHub] [arrow-datafusion] alamb commented on issue #6237: `push_down_projection` optimization fails when using variables
Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #6237:
URL: https://github.com/apache/arrow-datafusion/issues/6237#issuecomment-1536087792
Thanks for the report @rgwood -- @jackwener does this sound familiar to you?
[GitHub] [arrow-datafusion] alamb commented on issue #6237: `push_down_projection` optimization fails when using variables
Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #6237:
URL: https://github.com/apache/arrow-datafusion/issues/6237#issuecomment-1536091564
@rgwood would it be possible to provide a reproducer? I tried briefly but it seems like we don't have any way to set these variables via SQL yet
[GitHub] [arrow-datafusion] alamb commented on issue #6237: `push_down_projection` optimization fails when using variables
Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #6237:
URL: https://github.com/apache/arrow-datafusion/issues/6237#issuecomment-1538887702
All the kudos to @jackwener 🙏
[GitHub] [arrow-datafusion] rgwood commented on issue #6237: `push_down_projection` optimization fails when using variables
Posted by "rgwood (via GitHub)" <gi...@apache.org>.
rgwood commented on issue #6237:
URL: https://github.com/apache/arrow-datafusion/issues/6237#issuecomment-1538647694
Wow, that was a super quick response+fix. Thank you @jackwener and @alamb!
[GitHub] [arrow-datafusion] alamb closed issue #6237: `push_down_projection` optimization fails when using variables
Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb closed issue #6237: `push_down_projection` optimization fails when using variables
URL: https://github.com/apache/arrow-datafusion/issues/6237
[GitHub] [arrow-datafusion] jackwener commented on issue #6237: `push_down_projection` optimization fails when using variables
Posted by "jackwener (via GitHub)" <gi...@apache.org>.
jackwener commented on issue #6237:
URL: https://github.com/apache/arrow-datafusion/issues/6237#issuecomment-1536351024
Thank you @rgwood I will take a look.