Posted to issues@spark.apache.org by "Josh Rosen (JIRA)" <ji...@apache.org> on 2019/05/14 06:22:00 UTC

[jira] [Updated] (SPARK-6743) Join with empty projection on one side produces invalid results

     [ https://issues.apache.org/jira/browse/SPARK-6743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Josh Rosen updated SPARK-6743:
------------------------------
    Labels: correctness  (was: )

> Join with empty projection on one side produces invalid results
> ---------------------------------------------------------------
>
>                 Key: SPARK-6743
>                 URL: https://issues.apache.org/jira/browse/SPARK-6743
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.3.0
>            Reporter: Santiago M. Mola
>            Assignee: Michael Armbrust
>            Priority: Critical
>              Labels: correctness
>             Fix For: 1.4.0
>
>
> {code:java}
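> import org.apache.spark.sql.SQLContext
>
> val sqlContext = new SQLContext(sc)
> // Three rows; the tuple fields become columns _1, _2 and _3.
> val tab0 = sc.parallelize(Seq(
>   (83, 0, 38),
>   (26, 0, 79),
>   (43, 81, 24)
> ))
> sqlContext.registerDataFrameAsTable(sqlContext.createDataFrame(tab0), "tab0")
> sqlContext.cacheTable("tab0")
> // Query 1: columns are selected from both sides of the join.
> val df1 = sqlContext.sql("SELECT tab0._2, cor0._2 FROM tab0, tab0 cor0 GROUP BY tab0._2, cor0._2")
> val result1 = df1.collect()
> // Query 2: only cor0 is selected, so the projection on the left side is empty.
> val df2 = sqlContext.sql("SELECT cor0._2 FROM tab0, tab0 cor0 GROUP BY cor0._2")
> val result2 = df2.collect()
> // Query 3: same projection and grouping as query 2, without the join.
> val df3 = sqlContext.sql("SELECT cor0._2 FROM tab0 cor0 GROUP BY cor0._2")
> val result3 = df3.collect()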
> {code}
> Given the previous code, result2 equals Row(43), Row(83), Row(26), which is wrong. These values correspond to cor0._1 instead of cor0._2. The correct results would be Row(0), Row(81), which is what the third query returns. The first query also produces valid results; the only difference is that its projection on the left side of the join is not empty.
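
The expected result of the second query can be checked without Spark. The following is a minimal sketch using plain Scala collections, not the reporter's code: the object name ExpectedJoinResult and its fields are hypothetical, and it only models the semantics of the query (a cross join followed by GROUP BY cor0._2), not Spark's execution.

```scala
// Plain Scala collections only; no Spark required.
// ExpectedJoinResult and its members are hypothetical names for illustration.
object ExpectedJoinResult {
  // Same rows as tab0 in the report: (_1, _2, _3).
  val tab0: Seq[(Int, Int, Int)] = Seq((83, 0, 38), (26, 0, 79), (43, 81, 24))

  // Cross join of tab0 with itself (aliased cor0), as in "FROM tab0, tab0 cor0".
  val joined: Seq[((Int, Int, Int), (Int, Int, Int))] =
    for (left <- tab0; cor0 <- tab0) yield (left, cor0)

  // "SELECT cor0._2 ... GROUP BY cor0._2" yields the distinct values of cor0._2.
  val expectedResult2: Set[Int] = joined.map { case (_, cor0) => cor0._2 }.toSet

  // The values the report observed instead: the distinct values of cor0._1.
  val observedResult2: Set[Int] = joined.map { case (_, cor0) => cor0._1 }.toSet

  def main(args: Array[String]): Unit = {
    println(s"expected: $expectedResult2")
    println(s"observed: $observedResult2")
  }
}
```

The expected set contains 0 and 81, matching the third query, while the observed set contains 83, 26 and 43, which are the values of the first column.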


