Posted to issues@spark.apache.org by "Yuming Wang (Jira)" <ji...@apache.org> on 2021/03/20 08:35:00 UTC
[jira] [Created] (SPARK-34807) Push down filter through window after TransposeWindow
Yuming Wang created SPARK-34807:
-----------------------------------
Summary: Push down filter through window after TransposeWindow
Key: SPARK-34807
URL: https://issues.apache.org/jira/browse/SPARK-34807
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 3.2.0
Reporter: Yuming Wang
{code:scala}
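// Four columns that all mirror id, registered as temp view t1.
spark.range(10).selectExpr("id AS a", "id AS b", "id AS c", "id AS d").createTempView("t1")
// The outer filter references only b, a partition column of the sum window but not of the rank window.
val df = spark.sql(
  """
    |SELECT *
    |  FROM (
    |    SELECT b,
    |      sum(d) OVER (PARTITION BY a, b),
    |      rank() OVER (PARTITION BY a ORDER BY c)
    |    FROM t1
    |  ) v1
    |WHERE b = 2
    |""".stripMargin)
// Prints all plans, including the optimized logical plan.
df.explain(true)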
{code}
Current optimized plan (the filter on b stays above both Window operators):
{noformat}
== Optimized Logical Plan ==
Project [b#221L, sum(d) OVER (PARTITION BY a, b ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)#231L, RANK() OVER (PARTITION BY a ORDER BY c ASC NULLS FIRST ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)#232]
+- Filter (b#221L = 2)
   +- Window [rank(c#222L) windowspecdefinition(a#220L, c#222L ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS RANK() OVER (PARTITION BY a ORDER BY c ASC NULLS FIRST ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)#232], [a#220L], [c#222L ASC NULLS FIRST]
      +- Project [b#221L, a#220L, c#222L, sum(d) OVER (PARTITION BY a, b ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)#231L]
         +- Window [sum(d#223L) windowspecdefinition(a#220L, b#221L, specifiedwindowframe(RowFrame, unboundedpreceding$(), unboundedfollowing$())) AS sum(d) OVER (PARTITION BY a, b ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)#231L], [a#220L, b#221L]
            +- Project [id#218L AS b#221L, id#218L AS d#223L, id#218L AS a#220L, id#218L AS c#222L]
               +- Range (0, 10, step=1, splits=Some(2))
{noformat}
Expected optimized plan (the filter on b is pushed below the sum window, which is partitioned by a and b):
{noformat}
== Optimized Logical Plan ==
Project [b#221L, sum(d) OVER (PARTITION BY a, b ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)#231L, RANK() OVER (PARTITION BY a ORDER BY c ASC NULLS FIRST ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)#232]
+- Window [sum(d#223L) windowspecdefinition(a#220L, b#221L, specifiedwindowframe(RowFrame, unboundedpreceding$(), unboundedfollowing$())) AS sum(d) OVER (PARTITION BY a, b ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)#231L], [a#220L, b#221L]
   +- Project [b#221L, d#223L, a#220L, RANK() OVER (PARTITION BY a ORDER BY c ASC NULLS FIRST ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)#232]
      +- Filter (b#221L = 2)
         +- Window [rank(c#222L) windowspecdefinition(a#220L, c#222L ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS RANK() OVER (PARTITION BY a ORDER BY c ASC NULLS FIRST ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)#232], [a#220L], [c#222L ASC NULLS FIRST]
            +- Project [id#218L AS b#221L, id#218L AS d#223L, id#218L AS a#220L, id#218L AS c#222L]
               +- Range (0, 10, step=1, splits=Some(2))
{noformat}
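One way to read the difference between the two plans: a deterministic filter can generally move below a Window operator only when every column it references is one of that window's PARTITION BY columns. The snippet below is a minimal, Spark-free sketch of that check under this assumption. It is not Catalyst code; `WindowSpec` and `canPushBelow` are hypothetical names introduced only for illustration.

```scala
// Simplified model of a window operator: only its PARTITION BY columns matter here.
// WindowSpec and canPushBelow are hypothetical names, not Spark APIs.
final case class WindowSpec(partitionBy: Set[String])

// Assumed rule: a deterministic filter may move below a window only if all
// columns it references are PARTITION BY columns of that window.
def canPushBelow(filterRefs: Set[String], w: WindowSpec): Boolean =
  filterRefs.subsetOf(w.partitionBy)

val sumWindow  = WindowSpec(Set("a", "b")) // sum(d) OVER (PARTITION BY a, b)
val rankWindow = WindowSpec(Set("a"))      // rank() OVER (PARTITION BY a ORDER BY c)
val filterRefs = Set("b")                  // WHERE b = 2

println(canPushBelow(filterRefs, sumWindow))  // true
println(canPushBelow(filterRefs, rankWindow)) // false
```

Under this model the filter `b = 2` can sit below the sum window but must stay above the rank window, which matches the position of the Filter node in the expected plan.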