Posted to issues@spark.apache.org by "Ankit Raj Boudh (Jira)" <ji...@apache.org> on 2019/12/19 01:04:00 UTC
[jira] [Commented] (SPARK-30298) bucket join cannot work for self-join with views
[ https://issues.apache.org/jira/browse/SPARK-30298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16999636#comment-16999636 ]
Ankit Raj Boudh commented on SPARK-30298:
-----------------------------------------
I will raise a PR for this.
> bucket join cannot work for self-join with views
> ------------------------------------------------
>
> Key: SPARK-30298
> URL: https://issues.apache.org/jira/browse/SPARK-30298
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.0.0
> Reporter: Xiaoju Wu
> Priority: Minor
>
> This UT may fail at the last line:
> {code:java}
> test("bucket join cannot work for self-join with views") {
>   withSQLConf(SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key -> "1") {
>     withTable("t1") {
>       val df = (0 until 20).map(i => (i, i)).toDF("i", "j").as("df")
>       df.write
>         .format("parquet")
>         .bucketBy(8, "i")
>         .saveAsTable("t1")
>       sql("create view v1 as select * from t1").collect()
>       val plan1 = sql("SELECT * FROM t1 a JOIN t1 b ON a.i = b.i").queryExecution.executedPlan
>       assert(plan1.collect { case exchange: ShuffleExchangeExec => exchange }.isEmpty)
>       val plan2 = sql("SELECT * FROM t1 a JOIN v1 b ON a.i = b.i").queryExecution.executedPlan
>       assert(plan2.collect { case exchange: ShuffleExchangeExec => exchange }.isEmpty)
>     }
>   }
> }
> {code}
> This happens because the view introduces a Project whose output columns are Aliases over the table's attributes. The join's required distribution is then expressed in terms of those Aliases, but ProjectExec propagates its child's outputPartitioning unchanged, without translating it through the Aliases, so the satisfies check in EnsureRequirements fails and a shuffle is inserted.
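> A simplified, self-contained model of the mismatch (the real types such as ExprId, Alias, HashPartitioning, ClusteredDistribution, and the satisfies check live in Spark's catalyst; all names below are illustrative only, not Spark's API). The point is that attribute identity is by ExprId, so the Alias added by the view's Project carries a fresh ExprId that the scan's reported partitioning no longer matches:
> {code:java}
> // Hypothetical, simplified model; not Spark's actual classes.
> object AliasMismatchDemo extends App {
>   case class ExprId(id: Long)
>   case class Attr(name: String, exprId: ExprId)
>
>   case class HashPartitioning(keys: Seq[Attr])      // what the bucketed scan reports
>   case class ClusteredDistribution(keys: Seq[Attr]) // what the join requires
>
>   // EnsureRequirements-style check: identity is by ExprId, not by name.
>   def satisfies(p: HashPartitioning, d: ClusteredDistribution): Boolean =
>     p.keys.map(_.exprId) == d.keys.map(_.exprId)
>
>   val scanI  = Attr("i", ExprId(1)) // t1.i as produced by the scan
>   val aliasI = Attr("i", ExprId(2)) // the view's Project wraps i in an Alias with a fresh ExprId
>
>   // Self-join on the bare table: the ExprIds line up, no shuffle is needed.
>   println(satisfies(HashPartitioning(Seq(scanI)), ClusteredDistribution(Seq(scanI))))  // true
>
>   // Join through the view: the join key carries the alias's new ExprId, but
>   // the Project still reports partitioning over the old one, so a shuffle is added.
>   println(satisfies(HashPartitioning(Seq(scanI)), ClusteredDistribution(Seq(aliasI)))) // false
> }
> {code}
> A fix along these lines would presumably have ProjectExec rewrite its reported output partitioning through its alias list, so both sides of the check compare the same ExprIds.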