Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/01/24 20:47:58 UTC

[GitHub] [arrow-datafusion] alamb opened a new issue #1670: Query Optimizer: Add OUTER --> INNER join conversion

alamb opened a new issue #1670:
URL: https://github.com/apache/arrow-datafusion/issues/1670


   **Is your feature request related to a problem or challenge? Please describe what you are trying to do.**
   
   LEFT OUTER, RIGHT OUTER, and FULL OUTER JOINs are often more expensive to evaluate than INNER JOINs, and they preclude other optimizations (such as predicate pushdown, as can be seen in #1618).
   
   As such, sophisticated optimizers rewrite OUTER joins to INNER joins, when the query's predicates allow it, to improve performance.
   
   
   **Describe the solution you'd like**
   
   Add an optimizer pass that attempts to convert OUTER joins to INNER joins.
   
   This will require some nontrivial research to figure out under what conditions the joins can be rewritten / converted.
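
One commonly cited condition is a filter above the join that rejects NULLs coming from the null-supplying side: the padded rows an OUTER join adds would be filtered out anyway, so a cheaper join type gives the same result. Below is a minimal sketch of that rule in Python. It is not DataFusion code; `JoinType`, `simplify_join_type`, `join`, and the two boolean flags are hypothetical names used only for illustration, and deciding whether a predicate actually rejects NULLs is assumed to happen elsewhere.

```python
from enum import Enum


class JoinType(Enum):
    INNER = "INNER"
    LEFT = "LEFT OUTER"
    RIGHT = "RIGHT OUTER"
    FULL = "FULL OUTER"


def simplify_join_type(join_type, rejects_null_left, rejects_null_right):
    """Pick a cheaper join type given a filter applied above the join.

    rejects_null_left / rejects_null_right say whether that filter drops
    every row whose left / right columns are all NULL (assumed to be
    computed by some other analysis).
    """
    if join_type is JoinType.LEFT and rejects_null_right:
        return JoinType.INNER
    if join_type is JoinType.RIGHT and rejects_null_left:
        return JoinType.INNER
    if join_type is JoinType.FULL:
        if rejects_null_left and rejects_null_right:
            return JoinType.INNER
        if rejects_null_left:
            return JoinType.LEFT   # unmatched right rows would be filtered out
        if rejects_null_right:
            return JoinType.RIGHT  # unmatched left rows would be filtered out
    return join_type


def join(left, right, join_type):
    """Toy nested-loop equi-join of (key, value) rows; None stands in for NULL."""
    out, matched_right = [], set()
    for lk, lv in left:
        hit = False
        for i, (rk, rv) in enumerate(right):
            if lk is not None and lk == rk:  # NULL keys never match
                out.append((lk, lv, rk, rv))
                matched_right.add(i)
                hit = True
        if not hit and join_type in (JoinType.LEFT, JoinType.FULL):
            out.append((lk, lv, None, None))
    if join_type in (JoinType.RIGHT, JoinType.FULL):
        for i, (rk, rv) in enumerate(right):
            if i not in matched_right:
                out.append((None, None, rk, rv))
    return out


def right_value_gt_5(row):
    """Filter on a right-side column; it is never true when that column is NULL."""
    return row[3] is not None and row[3] > 5


left = [(1, "a"), (2, "b")]
right = [(2, 10), (3, 20)]

new_type = simplify_join_type(
    JoinType.LEFT, rejects_null_left=False, rejects_null_right=True
)
before = [r for r in join(left, right, JoinType.LEFT) if right_value_gt_5(r)]
after = [r for r in join(left, right, new_type) if right_value_gt_5(r)]
print(new_type.value, before == after)
```

On this sample data the filtered LEFT OUTER result and the filtered result of the rewritten join are identical, which is the property a real pass would need to guarantee for every input.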
   
   **Additional context**
   Relevant discussion: https://github.com/apache/arrow-datafusion/pull/1618#discussion_r790020079
   
   
   You can see a version of this code in Spark here: https://github.com/apache/spark/blob/aaf0e5e71509a2324e110e45366b753c7926c64b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala#L119-L135
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] alamb commented on issue #1670: Query Optimizer: Add OUTER --> INNER join conversion

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #1670:
URL: https://github.com/apache/arrow-datafusion/issues/1670#issuecomment-1020534907


   Whoops -- this is a dupe of https://github.com/apache/arrow-datafusion/issues/1585





[GitHub] [arrow-datafusion] alamb closed issue #1670: Query Optimizer: Add OUTER --> INNER join conversion

Posted by GitBox <gi...@apache.org>.
alamb closed issue #1670:
URL: https://github.com/apache/arrow-datafusion/issues/1670


   




