Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2017/03/02 06:23:45 UTC
[jira] [Commented] (SPARK-19766) INNER JOIN on constant alias
columns return incorrect results
[ https://issues.apache.org/jira/browse/SPARK-19766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15891692#comment-15891692 ]
Apache Spark commented on SPARK-19766:
--------------------------------------
User 'stanzhai' has created a pull request for this issue:
https://github.com/apache/spark/pull/17131
> INNER JOIN on constant alias columns return incorrect results
> -------------------------------------------------------------
>
> Key: SPARK-19766
> URL: https://issues.apache.org/jira/browse/SPARK-19766
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.1.0
> Reporter: StanZhai
> Assignee: StanZhai
> Priority: Critical
> Labels: Correctness
> Fix For: 2.1.1, 2.2.0
>
>
> We can demonstrate the problem with the following data set and query:
> {code}
> import org.apache.spark.sql.SparkSession
> val spark = SparkSession.builder().appName("test").master("local").getOrCreate()
> val sql1 =
> """
> |create temporary view t1 as select * from values
> |(1)
> |as grouping(a)
> """.stripMargin
> val sql2 =
> """
> |create temporary view t2 as select * from values
> |(1)
> |as grouping(a)
> """.stripMargin
> val sql3 =
> """
> |create temporary view t3 as select * from values
> |(1),
> |(1)
> |as grouping(a)
> """.stripMargin
> val sql4 =
> """
> |create temporary view t4 as select * from values
> |(1),
> |(1)
> |as grouping(a)
> """.stripMargin
> val sqlA =
> """
> |create temporary view ta as
> |select a, 'a' as tag from t1 union all
> |select a, 'b' as tag from t2
> """.stripMargin
> val sqlB =
> """
> |create temporary view tb as
> |select a, 'a' as tag from t3 union all
> |select a, 'b' as tag from t4
> """.stripMargin
> val sql =
> """
> |select tb.* from ta inner join tb on
> |ta.a = tb.a and
> |ta.tag = tb.tag
> """.stripMargin
> spark.sql(sql1)
> spark.sql(sql2)
> spark.sql(sql3)
> spark.sql(sql4)
> spark.sql(sqlA)
> spark.sql(sqlB)
> spark.sql(sql).show()
> {code}
> The query returns the following incorrect results:
> {code}
> +---+---+
> |  a|tag|
> +---+---+
> |  1|  b|
> |  1|  b|
> |  1|  a|
> |  1|  a|
> |  1|  b|
> |  1|  b|
> |  1|  a|
> |  1|  a|
> +---+---+
> {code}
> The correct results should be:
> {code}
> +---+---+
> |  a|tag|
> +---+---+
> |  1|  a|
> |  1|  a|
> |  1|  b|
> |  1|  b|
> +---+---+
> {code}
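As a cross-check on the expected row count, the same join can be sketched with plain Scala collections and no Spark dependency. This is only an illustration under stated assumptions: the names `ta`, `tb`, `expected`, and `withoutTagPredicate` are hypothetical stand-ins for the views in the report, and the second computation is an observation about row counts, not a statement about the root cause of the bug.

```scala
// Plain Scala collections only; rows are (a, tag) pairs mirroring the views above.
val ta: Seq[(Int, String)] = Seq((1, "a"), (1, "b"))                      // t1 union all t2
val tb: Seq[(Int, String)] = Seq((1, "a"), (1, "a"), (1, "b"), (1, "b")) // t3 union all t4

// Inner join on both columns, projecting tb.* as in the reported query.
val expected: Seq[(Int, String)] =
  for { l <- ta; r <- tb if l._1 == r._1 && l._2 == r._2 } yield r

// Hypothetical comparison: the same join with only the predicate on `a` applied.
val withoutTagPredicate: Seq[(Int, String)] =
  for { l <- ta; r <- tb if l._1 == r._1 } yield r

println(expected.size)            // 4 rows, matching the correct result above
println(withoutTagPredicate.size) // 8 rows, the same count as the incorrect result
```

The eight-row count is consistent with the join behaving as if `ta.tag = tb.tag` had not been applied, which may help when checking a fix against this reproduction.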
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)