Posted to issues@spark.apache.org by "sandeshyapuram (Jira)" <ji...@apache.org> on 2019/10/31 09:00:00 UTC
[jira] [Created] (SPARK-29682) Failure when resolving conflicting
references in Join:
sandeshyapuram created SPARK-29682:
--------------------------------------
Summary: Failure when resolving conflicting references in Join:
Key: SPARK-29682
URL: https://issues.apache.org/jira/browse/SPARK-29682
Project: Spark
Issue Type: Bug
Components: Spark Submit
Affects Versions: 2.4.3
Reporter: sandeshyapuram
When I try to self-join a parentDf with multiple child DataFrames (childDf1, ...),
where the child DataFrames are derived from a cube or rollup and then filtered based on their group-bys,
I get an error:
{{Failure when resolving conflicting references in Join: }}
The full error message is long and hard to read. On the other hand, if I replace cube or rollup with a plain groupBy, the same join works without issues.
*Sample code:*
{code:java}
// Run in spark-shell, which provides sc and spark.implicits._ (needed for toDF)
import org.apache.spark.sql.functions.{col, grouping_id, lit, max}

val numsDF = sc.parallelize(Seq(1,2,3,4,5,6)).toDF("nums")

val cubeDF = numsDF
  .cube("nums")
  .agg(
    max(lit(0)).as("agcol"),
    grouping_id().as("gid")
  )

val group0 = cubeDF.filter(col("gid") <=> lit(0))
val group1 = cubeDF.filter(col("gid") <=> lit(1))

cubeDF.printSchema
group0.printSchema
group1.printSchema

// Recreating cubeDf
cubeDF.select("nums").distinct
  .join(group0, Seq("nums"), "inner")
  .join(group1, Seq("nums"), "inner")
  .show
{code}
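*Sketch of the groupBy variant:*

For comparison, here is a minimal sketch of the groupBy variant that is reported above to work. It is an illustration, not the exact code that was run. The name groupedDF is hypothetical, and the constant gid column is an assumption: it stands in for grouping_id(), which can only be used with cube, rollup, or grouping sets. The sketch assumes a spark-shell session, which provides sc and spark.implicits._.

{code:java}
import org.apache.spark.sql.functions.{col, lit, max}

val numsDF = sc.parallelize(Seq(1,2,3,4,5,6)).toDF("nums")

// Plain groupBy instead of cube; gid is a literal placeholder (assumption),
// because grouping_id() is not available after groupBy
val groupedDF = numsDF
  .groupBy("nums")
  .agg(max(lit(0)).as("agcol"))
  .withColumn("gid", lit(0))

val group0 = groupedDF.filter(col("gid") <=> lit(0))
val group1 = groupedDF.filter(col("gid") <=> lit(1))

// Same join shape as in the failing sample
groupedDF.select("nums").distinct
  .join(group0, Seq("nums"), "inner")
  .join(group1, Seq("nums"), "inner")
  .show
{code}

The only differences from the failing sample are the groupBy call and the literal gid column, which points to the cube/rollup plan as the source of the conflicting references.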
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org