Posted to issues@spark.apache.org by "Xiao Li (JIRA)" <ji...@apache.org> on 2017/03/01 15:58:45 UTC

[jira] [Updated] (SPARK-19766) INNER JOIN on constant alias columns return incorrect results

     [ https://issues.apache.org/jira/browse/SPARK-19766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiao Li updated SPARK-19766:
----------------------------
    Fix Version/s: 2.2.0
                   2.1.1

> INNER JOIN on constant alias columns return incorrect results
> -------------------------------------------------------------
>
>                 Key: SPARK-19766
>                 URL: https://issues.apache.org/jira/browse/SPARK-19766
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.1.0
>            Reporter: StanZhai
>            Priority: Critical
>              Labels: Correctness
>             Fix For: 2.1.1, 2.2.0
>
>
> We can demonstrate the problem with the following data set and query:
> {code}
> import org.apache.spark.sql.SparkSession
>
> val spark = SparkSession.builder().appName("test").master("local").getOrCreate()
> val sql1 =
>   """
>     |create temporary view t1 as select * from values
>     |(1)
>     |as grouping(a)
>   """.stripMargin
> val sql2 =
>   """
>     |create temporary view t2 as select * from values
>     |(1)
>     |as grouping(a)
>   """.stripMargin
> val sql3 =
>   """
>     |create temporary view t3 as select * from values
>     |(1),
>     |(1)
>     |as grouping(a)
>   """.stripMargin
> val sql4 =
>   """
>     |create temporary view t4 as select * from values
>     |(1),
>     |(1)
>     |as grouping(a)
>   """.stripMargin
> val sqlA =
>   """
>     |create temporary view ta as
>     |select a, 'a' as tag from t1 union all
>     |select a, 'b' as tag from t2
>   """.stripMargin
> val sqlB =
>   """
>     |create temporary view tb as
>     |select a, 'a' as tag from t3 union all
>     |select a, 'b' as tag from t4
>   """.stripMargin
> val sql =
>   """
>     |select tb.* from ta inner join tb on
>     |ta.a = tb.a and
>     |ta.tag = tb.tag
>   """.stripMargin
> spark.sql(sql1)
> spark.sql(sql2)
> spark.sql(sql3)
> spark.sql(sql4)
> spark.sql(sqlA)
> spark.sql(sqlB)
> spark.sql(sql).show()
> {code}
> The query returns the following results, which are incorrect:
> {code}
> +---+---+
> |  a|tag|
> +---+---+
> |  1|  b|
> |  1|  b|
> |  1|  a|
> |  1|  a|
> |  1|  b|
> |  1|  b|
> |  1|  a|
> |  1|  a|
> +---+---+
> {code}
> The correct results should be:
> {code}
> +---+---+
> |  a|tag|
> +---+---+
> |  1|  a|
> |  1|  a|
> |  1|  b|
> |  1|  b|
> +---+---+
> {code}
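
As a cross-check of the expected output, here is a minimal sketch that evaluates the same join over plain Scala collections, with no Spark involved. The names ExpectedJoin, Row, innerJoin and joinOnAOnly are hypothetical and exist only for this illustration; the data mirrors the views ta and tb defined in the report. The second helper is included only to compare row counts and is not a diagnosis of the bug.

```scala
// Plain Scala collections stand in for the temporary views; Spark is not used.
object ExpectedJoin {
  // Hypothetical row type: one record of ta or tb.
  final case class Row(a: Int, tag: String)

  // ta = t1 tagged 'a' union all t2 tagged 'b'; t1 and t2 each hold one row with a = 1.
  val ta: Seq[Row] = Seq(Row(1, "a"), Row(1, "b"))

  // tb = t3 tagged 'a' union all t4 tagged 'b'; t3 and t4 each hold two rows with a = 1.
  val tb: Seq[Row] = Seq(Row(1, "a"), Row(1, "a"), Row(1, "b"), Row(1, "b"))

  // Nested-loop inner join on both keys (a and tag), projecting tb.* as in the query.
  def innerJoin(left: Seq[Row], right: Seq[Row]): Seq[Row] =
    for {
      l <- left
      r <- right
      if l.a == r.a && l.tag == r.tag
    } yield r

  // Same join with the tag predicate left out, for row-count comparison only.
  def joinOnAOnly(left: Seq[Row], right: Seq[Row]): Seq[Row] =
    for {
      l <- left
      r <- right
      if l.a == r.a
    } yield r

  def main(args: Array[String]): Unit = {
    println(s"join on a and tag: ${innerJoin(ta, tb).size} rows")
    println(s"join on a only:    ${joinOnAOnly(ta, tb).size} rows")
  }
}
```

Joining on both a and tag yields four rows (two tagged a, two tagged b), which agrees with the expected table above. Joining on a alone yields eight rows, the same count as the incorrect output. That is only an observation about the row counts and says nothing definite about where the planner goes wrong.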


