Posted to issues@spark.apache.org by "Xiao Li (JIRA)" <ji...@apache.org> on 2015/12/27 19:54:49 UTC
[jira] [Updated] (SPARK-12532) Join-key Pushdown via Predicate Transitivity

     [ https://issues.apache.org/jira/browse/SPARK-12532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiao Li updated SPARK-12532:
----------------------------
    Description: 
{code}
"SELECT * FROM upperCaseData JOIN lowerCaseData where lowerCaseData.n = upperCaseData.N and lowerCaseData.n = 3"
{code}
{code}
== Analyzed Logical Plan ==
N: int, L: string, n: int, l: string
Project [N#16,L#17,n#18,l#19]
+- Filter ((n#18 = N#16) && (n#18 = 3))
   +- Join Inner, None
      :- Subquery upperCaseData
      :  +- LogicalRDD [N#16,L#17], MapPartitionsRDD[17] at beforeAll at BeforeAndAfterAll.scala:187
      +- Subquery lowerCaseData
         +- LogicalRDD [n#18,l#19], MapPartitionsRDD[19] at beforeAll at BeforeAndAfterAll.scala:187
{code}
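Before the improvement, the optimized logical plan pushes only the explicit filter (n#18 = 3) down to the lowerCaseData side; the upperCaseData side is scanned unfiltered: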
{code}
== Optimized Logical Plan ==
Project [N#16,L#17,n#18,l#19]
+- Join Inner, Some((n#18 = N#16))
   :- LogicalRDD [N#16,L#17], MapPartitionsRDD[17] at beforeAll at BeforeAndAfterAll.scala:187
   +- Filter (n#18 = 3)
      +- LogicalRDD [n#18,l#19], MapPartitionsRDD[19] at beforeAll at BeforeAndAfterAll.scala:187
{code}
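After the improvement, the optimizer should also infer N#16 = 3 by transitivity (from n#18 = N#16 and n#18 = 3) and push that filter below the join on the upperCaseData side, so the optimized logical plan should look like this: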
{code}
== Optimized Logical Plan ==
Project [N#16,L#17,n#18,l#19]
+- Join Inner, Some((n#18 = N#16))
   :- Filter (N#16 = 3)
   :  +- LogicalRDD [N#16,L#17], MapPartitionsRDD[17] at beforeAll at BeforeAndAfterAll.scala:187
   +- Filter (n#18 = 3)
      +- LogicalRDD [n#18,l#19], MapPartitionsRDD[19] at beforeAll at BeforeAndAfterAll.scala:187
{code}
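The inference behind the extra filter can be sketched outside Spark. The snippet below is only an illustration under stated assumptions, not Catalyst code: the function name infer_constant_filters and the tuple/dict encoding of predicates are hypothetical, and it only handles equality predicates of an inner join. It treats column equalities as edges of a graph and copies each "column = literal" filter to every column reachable from it.

```python
from collections import defaultdict


def infer_constant_filters(column_equalities, constant_filters):
    """Return the 'column = literal' filters implied by transitivity.

    column_equalities: iterable of (col_a, col_b) pairs, meaning col_a = col_b
    constant_filters:  dict mapping a column to the literal it must equal

    Hypothetical helper for illustration; assumes inner-join semantics and
    equality predicates only.
    """
    # Build an undirected graph whose edges are the column equalities.
    neighbours = defaultdict(set)
    for a, b in column_equalities:
        neighbours[a].add(b)
        neighbours[b].add(a)

    inferred = {}
    for start, literal in constant_filters.items():
        # Every column reachable from `start` must also equal `literal`.
        stack, seen = [start], {start}
        while stack:
            col = stack.pop()
            for nxt in neighbours[col]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
                    # Only report filters that were not already given.
                    if nxt not in constant_filters:
                        inferred[nxt] = literal
    return inferred


# The query from this issue:
#   lowerCaseData.n = upperCaseData.N AND lowerCaseData.n = 3
join_keys = [("lowerCaseData.n", "upperCaseData.N")]
filters = {"lowerCaseData.n": 3}
print(infer_constant_filters(join_keys, filters))
# prints {'upperCaseData.N': 3}
```

The inferred entry corresponds to the new Filter (N#16 = 3) node on the upperCaseData side of the join in the plan above. A real optimizer rule would also have to consider outer joins and conflicting constants, which this sketch ignores.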

  was:
{code}
"SELECT * FROM upperCaseData JOIN lowerCaseData where lowerCaseData.n = upperCaseData.N and lowerCaseData.n = 3"
{code}
{code}
== Analyzed Logical Plan ==
N: int, L: string, n: int, l: string
Project [N#16,L#17,n#18,l#19]
+- Filter ((n#18 = N#16) && (n#18 = 3))
   +- Join Inner, None
      :- Subquery upperCaseData
      :  +- LogicalRDD [N#16,L#17], MapPartitionsRDD[17] at beforeAll at BeforeAndAfterAll.scala:187
      +- Subquery lowerCaseData
         +- LogicalRDD [n#18,l#19], MapPartitionsRDD[19] at beforeAll at BeforeAndAfterAll.scala:187
{code}
{code}
== Optimized Logical Plan ==
Project [N#16,L#17,n#18,l#19]
+- Join Inner, Some((n#18 = N#16))
   :- Filter (N#16 = 3)
   :  +- LogicalRDD [N#16,L#17], MapPartitionsRDD[17] at beforeAll at BeforeAndAfterAll.scala:187
   +- Filter (n#18 = 3)
      +- LogicalRDD [n#18,l#19], MapPartitionsRDD[19] at beforeAll at BeforeAndAfterAll.scala:187
{code}


> Join-key Pushdown via Predicate Transitivity
> --------------------------------------------
>
>                 Key: SPARK-12532
>                 URL: https://issues.apache.org/jira/browse/SPARK-12532
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 1.6.0
>            Reporter: Xiao Li
>              Labels: SQL
>


