Posted to issues@spark.apache.org by "Allison Wang (Jira)" <ji...@apache.org> on 2022/02/09 05:32:00 UTC

[jira] [Created] (SPARK-38155) Disallow distinct aggregate in lateral subqueries with unsupported correlated predicates

Allison Wang created SPARK-38155:
------------------------------------

             Summary: Disallow distinct aggregate in lateral subqueries with unsupported correlated predicates
                 Key: SPARK-38155
                 URL: https://issues.apache.org/jira/browse/SPARK-38155
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
    Affects Versions: 3.3.0
            Reporter: Allison Wang


Block lateral subqueries in CheckAnalysis that contain a DISTINCT aggregate and correlated non-equality predicates. Such queries can return incorrect results because DISTINCT is rewritten as an Aggregate during the optimization phase.

For example:

CREATE VIEW t1(c1, c2) AS VALUES (0, 1)

CREATE VIEW t2(c1, c2) AS VALUES (1, 2), (2, 2)

SELECT * FROM t1 JOIN LATERAL (SELECT DISTINCT c2 FROM t2 WHERE c1 > t1.c1)

The correct result is a single row, (0, 1, 2), but the query currently returns a duplicate row: [(0, 1, 2), (0, 1, 2)]
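
To make the expected semantics concrete, here is a minimal plain-Python sketch of the query above. It is not Spark code. The function names lateral_distinct and naive_decorrelated are hypothetical, and the second function is only an assumed model of how the duplicate could arise (the correlated predicate is evaluated above the DISTINCT, so t2.c1 has to be kept and deduplication runs over (c1, c2) instead of c2 alone). The ticket itself does not spell out the mechanism beyond the Aggregate rewrite.

```python
# Plain-Python model of the example query; this is not Spark code.
t1 = [(0, 1)]            # rows of t1 as (c1, c2)
t2 = [(1, 2), (2, 2)]    # rows of t2 as (c1, c2)


def lateral_distinct(outer_rows, inner_rows):
    """Expected semantics: evaluate the subquery once per outer row."""
    result = []
    for o_c1, o_c2 in outer_rows:
        # SELECT DISTINCT c2 FROM t2 WHERE c1 > t1.c1
        distinct_c2 = sorted({i_c2 for i_c1, i_c2 in inner_rows if i_c1 > o_c1})
        result.extend((o_c1, o_c2, c2) for c2 in distinct_c2)
    return result


def naive_decorrelated(outer_rows, inner_rows):
    """Assumed failure mode (hypothetical): the predicate is applied above
    the DISTINCT, so deduplication runs over (c1, c2) rather than c2."""
    distinct_pairs = sorted(set(inner_rows))  # both t2 rows survive
    return [
        (o_c1, o_c2, i_c2)
        for o_c1, o_c2 in outer_rows
        for i_c1, i_c2 in distinct_pairs
        if i_c1 > o_c1
    ]


expected = lateral_distinct(t1, t2)
buggy = naive_decorrelated(t1, t2)
print(expected)
print(buggy)
```

Under these assumptions, the first function yields the single row described as correct above, while the second reproduces the duplicated row.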


