Posted to github@arrow.apache.org by "rgwood (via GitHub)" <gi...@apache.org> on 2023/05/04 22:15:57 UTC

[GitHub] [arrow-datafusion] rgwood opened a new issue, #6237: `push_down_projection` optimization fails when using variables

rgwood opened a new issue, #6237:
URL: https://github.com/apache/arrow-datafusion/issues/6237

   ### Describe the bug
   
   Hello and thank you for writing+maintaining DataFusion!
   
   When upgrading from DataFusion 18, we noticed an unexpected warning that started in #4465 by @jackwener (which shipped in DataFusion 20):
   
   > WARN datafusion_optimizer::optimizer: Skipping optimizer rule 'push_down_projection' due to unexpected error: Error during planning: required columns can't push down
   
   ### To Reproduce
   
   I've put together a minimal Rust repro here: https://github.com/rgwood/df-repro
   
   In that repro, I use a [`VarProvider`](https://docs.rs/datafusion/latest/datafusion/variable/trait.VarProvider.html) to define an integer variable `@var`.
   
   I then use that variable in the WHERE clause of a SQL query: `SELECT foo FROM csv_table WHERE bar > @var`
   
   When evaluating the query, a warning indicates that `push_down_projection` was skipped:
   
   > WARN datafusion_optimizer::optimizer: Skipping optimizer rule 'push_down_projection' due to unexpected error: Error during planning: required columns can't push down, columns: {Column { relation: Some(Bare { table: "csv_table" }), name: "bar" }, Column { relation: None, name: "@var" }, Column { relation: Some(Bare { table: "csv_table" }), name: "foo" }}
   
   ### Expected behavior
   
   I expect the `push_down_projection` optimization to succeed without a warning.
   
   ### Additional context
   
   I'm relatively new to DataFusion, but it seems like `push_down_projection` is incorrectly interpreting the variable `@var` as a column when it should not be.
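
The hypothesis above can be illustrated with a toy model. The sketch below is not DataFusion's code: `Expr`, `required_columns`, `push_down`, the `variables_as_columns` flag, and the extra `baz` column are all hypothetical names invented for illustration, and the thread does not describe the actual fix. It only shows how a pushdown step that checks required names against a table schema would fail if `@var` were collected as a column, and succeed once variables are skipped.

```rust
use std::collections::BTreeSet;

/// Toy expression tree (hypothetical; not DataFusion's `Expr`).
#[derive(Debug, Clone)]
enum Expr {
    /// Reference to a table column, e.g. `bar`.
    Column(String),
    /// Reference to a provider-backed variable, e.g. `@var`.
    Variable(String),
    /// Comparison `left > right`.
    Gt(Box<Expr>, Box<Expr>),
}

/// Collects the names a table scan must supply for `expr`.
/// When `variables_as_columns` is true, variables are (wrongly) added to the
/// set, which mimics the behavior hypothesized in the report.
fn required_columns(expr: &Expr, variables_as_columns: bool, out: &mut BTreeSet<String>) {
    match expr {
        Expr::Column(name) => {
            out.insert(name.clone());
        }
        Expr::Variable(name) => {
            if variables_as_columns {
                out.insert(name.clone());
            }
        }
        Expr::Gt(left, right) => {
            required_columns(left, variables_as_columns, out);
            required_columns(right, variables_as_columns, out);
        }
    }
}

/// Narrows a scan to the required columns, in schema order.
/// Fails if any required name is missing from the table schema.
fn push_down(schema: &[&str], required: &BTreeSet<String>) -> Result<Vec<String>, String> {
    for name in required {
        if !schema.contains(&name.as_str()) {
            return Err(format!(
                "required columns can't push down, columns: {:?}",
                required
            ));
        }
    }
    Ok(schema
        .iter()
        .filter(|column| required.contains(**column))
        .map(|column| column.to_string())
        .collect())
}
```

For `SELECT foo FROM csv_table WHERE bar > @var`, collecting names from `Column("foo")` and `Gt(Column("bar"), Variable("@var"))` with `variables_as_columns = true` puts `@var` in the required set, so `push_down` returns an error. With the flag set to `false`, the scan narrows to `foo` and `bar`.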


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] rgwood commented on issue #6237: `push_down_projection` optimization fails when using variables

Posted by "rgwood (via GitHub)" <gi...@apache.org>.
rgwood commented on issue #6237:
URL: https://github.com/apache/arrow-datafusion/issues/6237#issuecomment-1536341666

   > would it be possible to provide a reproducer?
   
   Sure, I’ve put together a minimal reproduction here: https://github.com/rgwood/df-repro




[GitHub] [arrow-datafusion] alamb commented on issue #6237: `push_down_projection` optimization fails when using variables

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #6237:
URL: https://github.com/apache/arrow-datafusion/issues/6237#issuecomment-1536087792

   Thanks for the report @rgwood -- @jackwener does this sound familiar to you?




[GitHub] [arrow-datafusion] alamb commented on issue #6237: `push_down_projection` optimization fails when using variables

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #6237:
URL: https://github.com/apache/arrow-datafusion/issues/6237#issuecomment-1536091564

   @rgwood would it be possible to provide a reproducer? I tried briefly, but it seems like we don't have any way to set these variables via SQL yet.




[GitHub] [arrow-datafusion] alamb commented on issue #6237: `push_down_projection` optimization fails when using variables

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #6237:
URL: https://github.com/apache/arrow-datafusion/issues/6237#issuecomment-1538887702

   All the kudos to @jackwener  🙏 




[GitHub] [arrow-datafusion] rgwood commented on issue #6237: `push_down_projection` optimization fails when using variables

Posted by "rgwood (via GitHub)" <gi...@apache.org>.
rgwood commented on issue #6237:
URL: https://github.com/apache/arrow-datafusion/issues/6237#issuecomment-1538647694

   Wow, that was a super quick response+fix. Thank you @jackwener and @alamb!




[GitHub] [arrow-datafusion] alamb closed issue #6237: `push_down_projection` optimization fails when using variables

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb closed issue #6237: `push_down_projection` optimization fails when using variables
URL: https://github.com/apache/arrow-datafusion/issues/6237




[GitHub] [arrow-datafusion] jackwener commented on issue #6237: `push_down_projection` optimization fails when using variables

Posted by "jackwener (via GitHub)" <gi...@apache.org>.
jackwener commented on issue #6237:
URL: https://github.com/apache/arrow-datafusion/issues/6237#issuecomment-1536351024

   Thank you @rgwood I will take a look.

