Posted to issues@spark.apache.org by "Xiao Li (JIRA)" <ji...@apache.org> on 2015/12/27 19:54:49 UTC
[jira] [Updated] (SPARK-12532) Join-key Pushdown via Predicate Transitivity
[ https://issues.apache.org/jira/browse/SPARK-12532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xiao Li updated SPARK-12532:
----------------------------
Description:
{code}
"SELECT * FROM upperCaseData JOIN lowerCaseData where lowerCaseData.n = upperCaseData.N and lowerCaseData.n = 3"
{code}
{code}
== Analyzed Logical Plan ==
N: int, L: string, n: int, l: string
Project [N#16,L#17,n#18,l#19]
+- Filter ((n#18 = N#16) && (n#18 = 3))
   +- Join Inner, None
      :- Subquery upperCaseData
      :  +- LogicalRDD [N#16,L#17], MapPartitionsRDD[17] at beforeAll at BeforeAndAfterAll.scala:187
      +- Subquery lowerCaseData
         +- LogicalRDD [n#18,l#19], MapPartitionsRDD[19] at beforeAll at BeforeAndAfterAll.scala:187
{code}
Before the improvement, the optimized logical plan pushes the filter down only on the lowerCaseData side:
{code}
== Optimized Logical Plan ==
Project [N#16,L#17,n#18,l#19]
+- Join Inner, Some((n#18 = N#16))
   :- LogicalRDD [N#16,L#17], MapPartitionsRDD[17] at beforeAll at BeforeAndAfterAll.scala:187
   +- Filter (n#18 = 3)
      +- LogicalRDD [n#18,l#19], MapPartitionsRDD[19] at beforeAll at BeforeAndAfterAll.scala:187
{code}
After the improvement, the optimizer should infer N = 3 from n = N and n = 3, so the optimized logical plan should filter both sides of the join:
{code}
== Optimized Logical Plan ==
Project [N#16,L#17,n#18,l#19]
+- Join Inner, Some((n#18 = N#16))
   :- Filter (N#16 = 3)
   :  +- LogicalRDD [N#16,L#17], MapPartitionsRDD[17] at beforeAll at BeforeAndAfterAll.scala:187
   +- Filter (n#18 = 3)
      +- LogicalRDD [n#18,l#19], MapPartitionsRDD[19] at beforeAll at BeforeAndAfterAll.scala:187
{code}
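The inference behind this plan change can be sketched outside Spark. The following Python snippet is only an illustration, not the Catalyst implementation: the function name infer_constant_filters and its input shapes are hypothetical, and it assumes an inner join with plain equality conjuncts and no conflicting literals. It groups columns connected by equality and copies a known literal to every other column in the same group.

```python
def infer_constant_filters(equalities, constants):
    """Derive extra `column = literal` predicates by equality transitivity.

    equalities: iterable of (col_a, col_b) pairs taken from `a = b` conjuncts.
    constants:  dict mapping a column to the literal it is compared against.
    Returns only the newly inferred column -> literal bindings.
    Hypothetical helper for illustration; conflicting literals are not detected.
    """
    # Union-find over column names: columns known to be equal share a root.
    parent = {}

    def find(col):
        parent.setdefault(col, col)
        while parent[col] != col:
            parent[col] = parent[parent[col]]  # path halving
            col = parent[col]
        return col

    for a, b in equalities:
        parent[find(a)] = find(b)

    # Record the literal known for each equivalence class, if any.
    class_value = {}
    for col, value in constants.items():
        class_value[find(col)] = value

    # Every other column in a class with a literal gets the same literal.
    inferred = {}
    for col in list(parent):
        root = find(col)
        if root in class_value and col not in constants:
            inferred[col] = class_value[root]
    return inferred


# Conjuncts of the WHERE clause in the example query.
equalities = [("lowerCaseData.n", "upperCaseData.N")]
constants = {"lowerCaseData.n": 3}

print(infer_constant_filters(equalities, constants))
# prints {'upperCaseData.N': 3}
```

The inferred binding corresponds to the extra Filter (N#16 = 3) on the upperCaseData side of the join. The sketch deliberately ignores outer joins and null semantics, where such an inference may not be safe.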
was:
{code}
"SELECT * FROM upperCaseData JOIN lowerCaseData where lowerCaseData.n = upperCaseData.N and lowerCaseData.n = 3"
{code}
{code}
== Analyzed Logical Plan ==
N: int, L: string, n: int, l: string
Project [N#16,L#17,n#18,l#19]
+- Filter ((n#18 = N#16) && (n#18 = 3))
   +- Join Inner, None
      :- Subquery upperCaseData
      :  +- LogicalRDD [N#16,L#17], MapPartitionsRDD[17] at beforeAll at BeforeAndAfterAll.scala:187
      +- Subquery lowerCaseData
         +- LogicalRDD [n#18,l#19], MapPartitionsRDD[19] at beforeAll at BeforeAndAfterAll.scala:187
{code}
{code}
== Optimized Logical Plan ==
Project [N#16,L#17,n#18,l#19]
+- Join Inner, Some((n#18 = N#16))
   :- Filter (N#16 = 3)
   :  +- LogicalRDD [N#16,L#17], MapPartitionsRDD[17] at beforeAll at BeforeAndAfterAll.scala:187
   +- Filter (n#18 = 3)
      +- LogicalRDD [n#18,l#19], MapPartitionsRDD[19] at beforeAll at BeforeAndAfterAll.scala:187
{code}
> Join-key Pushdown via Predicate Transitivity
> --------------------------------------------
>
> Key: SPARK-12532
> URL: https://issues.apache.org/jira/browse/SPARK-12532
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 1.6.0
> Reporter: Xiao Li
> Labels: SQL