You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Apache Spark (Jira)" <ji...@apache.org> on 2021/09/03 00:18:00 UTC

[jira] [Commented] (SPARK-36656) CollapseProject should not collapse correlated scalar subqueries

    [ https://issues.apache.org/jira/browse/SPARK-36656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17409173#comment-17409173 ] 

Apache Spark commented on SPARK-36656:
--------------------------------------

User 'allisonwang-db' has created a pull request for this issue:
https://github.com/apache/spark/pull/33903

> CollapseProject should not collapse correlated scalar subqueries
> ----------------------------------------------------------------
>
>                 Key: SPARK-36656
>                 URL: https://issues.apache.org/jira/browse/SPARK-36656
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 3.3.0
>            Reporter: Allison Wang
>            Priority: Major
>
> Currently, the optimizer rule `CollapseProject` inlines expressions generated from correlated scalar subqueries, which can create unnecessary left outer joins.
> {code:sql}
> select c1, s, s * 10 from (
> select c1, (select first(c2) from t2 where t1.c1 = t2.c1) s from t1)
> {code}
> {code:scala}
> // Before
> Project [c1, s, (s * 10)]
> +- Project [c1, scalar-subquery [c1] AS s]
>    :  +- Aggregate [c1], [first(c2), c1] 
>    :      +- LocalRelation [c1, c2]
>    +- LocalRelation [c1, c2]
> // After (scalar subqueries are inlined)
> Project [c1, scalar-subquery [c1], (scalar-subquery [c1] * 10)]
> :  +- Aggregate [c1], [first(c2), c1] 
> :      +- LocalRelation [c1, c2]
> :  +- Aggregate [c1], [first(c2), c1] 
> :      +- LocalRelation [c1, c2]
> +- LocalRelation [c1, c2]
> {code}
> Then this query will have two LeftOuter joins created. We should only collapse projects after correlated subqueries are rewritten as joins.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org