Posted to issues@spark.apache.org by "Allison Wang (Jira)" <ji...@apache.org> on 2021/09/13 20:26:00 UTC

[jira] [Created] (SPARK-36747) Do not collapse Project with Aggregate when correlated subqueries are present in the project list

Allison Wang created SPARK-36747:
------------------------------------

             Summary: Do not collapse Project with Aggregate when correlated subqueries are present in the project list
                 Key: SPARK-36747
                 URL: https://issues.apache.org/jira/browse/SPARK-36747
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
    Affects Versions: 3.2.0
            Reporter: Allison Wang


Currently, CollapseProject combines a Project with an Aggregate when the shared attributes are deterministic. However, if the project list contains correlated scalar subqueries that use the output of the Aggregate, the two operators cannot be combined. Otherwise, the plan after the rewrite is invalid. In the example below, the aggregate function sum(b) ends up inside the join condition, where no code can be generated for it:
```
select (select sum(c2) from t where c1 = cast(s as int)) from (select sum(c2) s from t)

== Optimized Logical Plan ==
Aggregate [sum(b)#28L AS scalarsubquery(s)#29L]
+- Project [sum(b)#28L]
   +- Join LeftOuter, (a#20 = cast(sum(b#21) as int))
      :- LocalRelation [b#21]
      +- Aggregate [a#20], [sum(b#21) AS sum(b)#28L, a#20]
         +- LocalRelation [a#20, b#21]

java.lang.UnsupportedOperationException: Cannot generate code for expression: sum(input[0, int, false])
```
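
The proposed guard can be illustrated with a small toy model. This is only a sketch of the idea, not Spark code: the classes `Project`, `Aggregate`, and `ScalarSubquery` and the function `can_collapse` below are hypothetical stand-ins written for illustration, and they do not mirror the actual Catalyst classes or the CollapseProject implementation. The model assumes a correlated subquery is described simply by the set of outer attribute names it references.
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalarSubquery:
    """Hypothetical stand-in for a scalar subquery in a project list.

    outer_refs holds the names of the outer attributes the subquery is
    correlated on (empty for an uncorrelated subquery).
    """
    name: str
    outer_refs: frozenset = frozenset()


@dataclass
class Aggregate:
    """Hypothetical stand-in: only the output attribute names are modeled."""
    output: list


@dataclass
class Project:
    """Hypothetical stand-in: entries are attribute names or ScalarSubquery."""
    project_list: list
    child: object


def can_collapse(plan) -> bool:
    """Return True if this toy Project may be merged into its Aggregate child.

    The merge is refused when any scalar subquery in the project list is
    correlated on an attribute produced by the Aggregate.
    """
    if not (isinstance(plan, Project) and isinstance(plan.child, Aggregate)):
        return False
    agg_output = set(plan.child.output)
    for expr in plan.project_list:
        if isinstance(expr, ScalarSubquery) and expr.outer_refs & agg_output:
            return False
    return True


# Shape of the failing query: the subquery is correlated on `s`,
# which is the output of the inner aggregate.
inner = Aggregate(output=["s"])
correlated = Project([ScalarSubquery("sub", frozenset({"s"}))], inner)
plain = Project(["s"], inner)

print(can_collapse(correlated))
print(can_collapse(plain))
```
In this model the first plan is left alone and the second one may still be collapsed, which matches the behavior the summary asks for.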


