Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2018/01/23 20:52:00 UTC
[jira] [Assigned] (SPARK-23195) Hint of cached data is lost
[ https://issues.apache.org/jira/browse/SPARK-23195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-23195:
------------------------------------
Assignee: Xiao Li (was: Apache Spark)
> Hint of cached data is lost
> ---------------------------
>
> Key: SPARK-23195
> URL: https://issues.apache.org/jira/browse/SPARK-23195
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.2.1, 2.3.0
> Reporter: Xiao Li
> Assignee: Xiao Li
> Priority: Major
>
> {noformat}
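> // Disable size-based automatic broadcast joins so only an explicit hint can trigger one.
> withSQLConf(SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key -> "-1") {
>   val df1 = spark.createDataFrame(Seq((1, "4"), (2, "2"))).toDF("key", "value")
>   val df2 = spark.createDataFrame(Seq((1, "1"), (2, "2"))).toDF("key", "value")
>   // Cache df2 through a broadcast hint, then run an action on df2.
>   broadcast(df2).cache()
>   df2.collect()
>   val df3 = df1.join(df2, Seq("key"), "inner")
>   // Count the broadcast hash joins in the physical plan; exactly one is expected.
>   val numBroadCastHashJoin = df3.queryExecution.executedPlan.collect {
>     case b: BroadcastHashJoinExec => b
>   }.size
>   assert(numBroadCastHashJoin === 1)
> }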
> {noformat}
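> The broadcast hint is not respected: df2 was cached through broadcast(df2), but the join in df3 is not planned as a BroadcastHashJoinExec, so the assertion fails.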