Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/11/02 17:38:44 UTC

[GitHub] [arrow-datafusion] alamb opened a new issue, #4089: Simplify small `InList` expressions

alamb opened a new issue, #4089:
URL: https://github.com/apache/arrow-datafusion/issues/4089

   **Is your feature request related to a problem or challenge? Please describe what you are trying to do.**
   You can write a query like
   
   ```sql
   select ... from foo where id in (4)
   ```
   
   Such SQL is often generated by tools that handle a variable number of ids.
   
   We have a specialized `InList` implementation (e.g. see https://github.com/apache/arrow-datafusion/pull/4057), but for a single value it is still faster to use a standard equality predicate.
   
   **Describe the solution you'd like**
   
   As mentioned by @jackwener and @Dandandan in https://github.com/apache/arrow-datafusion/pull/4057#discussion_ we should rewrite `InList` expressions that have only a few elements.
   
   We should definitely simplify `<left> IN (<expr>)` to `<left> = <expr>` as that will be better in all cases. 
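
   To make the proposed rule concrete, here is a minimal sketch of the single-element rewrite. It uses a toy `Expr` enum and a hypothetical `simplify_in_list` function invented for illustration; neither is DataFusion's actual `Expr` type or simplifier API, so the real rule will look different.

   ```rust
   // Toy expression tree for illustration only; this is NOT DataFusion's `Expr`.
   #[derive(Debug, Clone, PartialEq)]
   enum Expr {
       Column(String),
       Literal(i64),
       Eq(Box<Expr>, Box<Expr>),
       InList { expr: Box<Expr>, list: Vec<Expr> },
   }

   // Hypothetical helper: rewrite `<left> IN (<expr>)` to `<left> = <expr>`.
   // Lists with zero or several elements are returned unchanged.
   fn simplify_in_list(e: Expr) -> Expr {
       match e {
           Expr::InList { expr, mut list } if list.len() == 1 => {
               let only = list.pop().expect("length checked in the guard");
               Expr::Eq(expr, Box::new(only))
           }
           other => other,
       }
   }

   fn main() {
       // id IN (4)
       let single = Expr::InList {
           expr: Box::new(Expr::Column("id".to_string())),
           list: vec![Expr::Literal(4)],
       };
       // Prints the rewritten `id = 4` expression in Debug form.
       println!("{:?}", simplify_in_list(single));
   }
   ```

   The sketch ignores negated lists (`NOT IN`) and only rewrites the top-level node; a real rule would be applied while walking the whole expression tree.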
   
   **Describe alternatives you've considered**
   
   We could potentially also rewrite `<left> IN (<expr>, <expr2>, .. <exprN>)` to `<left> = <expr> OR <left> = <expr2> OR .. <left> = <exprN>`
   
   However, at some point the `InList` expression is faster to evaluate, and that break-even point depends on the cost to evaluate `<left>`. Thus I suggest we only rewrite single-value IN lists.
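
   One way to see why the cost of `<left>` matters: in the `OR` expansion, `<left>` appears once per list element. The sketch below again uses a toy `Expr` enum and hypothetical helpers (`expand_to_or`, `count_occurrences`), not DataFusion code, and it assumes a naive evaluator that evaluates every copy of `<left>` separately.

   ```rust
   // Toy expression tree for illustration only; this is NOT DataFusion's `Expr`.
   #[derive(Debug, Clone, PartialEq)]
   enum Expr {
       Column(String),
       Literal(i64),
       Eq(Box<Expr>, Box<Expr>),
       Or(Box<Expr>, Box<Expr>),
   }

   // Hypothetical helper: expand `left IN (v1, .., vN)` into
   // `left = v1 OR .. OR left = vN`. Returns `None` for an empty list.
   fn expand_to_or(left: &Expr, list: &[Expr]) -> Option<Expr> {
       list.iter()
           .map(|v| Expr::Eq(Box::new(left.clone()), Box::new(v.clone())))
           .reduce(|acc, eq| Expr::Or(Box::new(acc), Box::new(eq)))
   }

   // Count how many times `target` occurs as a subtree of `e`.
   fn count_occurrences(e: &Expr, target: &Expr) -> usize {
       if e == target {
           return 1;
       }
       match e {
           Expr::Eq(l, r) | Expr::Or(l, r) => {
               count_occurrences(l, target) + count_occurrences(r, target)
           }
           _ => 0,
       }
   }

   fn main() {
       let left = Expr::Column("id".to_string());
       let list = vec![Expr::Literal(4), Expr::Literal(5), Expr::Literal(6)];
       let expanded = expand_to_or(&left, &list).expect("list is non-empty");
       // `left` occurs once per list element in the expanded form.
       println!("copies of left: {}", count_occurrences(&expanded, &left));
   }
   ```

   With a single-element list the expansion contains exactly one copy of `<left>`, which is why that case is safe to rewrite regardless of how expensive `<left>` is.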
   
   
   
   **Additional context**
   This is a good first issue because there are several examples of the code and tests to follow
   
   You can find simplify rules here: https://github.com/apache/arrow-datafusion/blob/10e64dc013ba210ab1f6c2a3c02c66aef4a0e802/datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs#L329-L339
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] Dandandan closed issue #4089: Simplify small `InList` expressions

Posted by GitBox <gi...@apache.org>.
Dandandan closed issue #4089: Simplify small `InList` expressions
URL: https://github.com/apache/arrow-datafusion/issues/4089

