Posted to issues@spark.apache.org by "Supritha (JIRA)" <ji...@apache.org> on 2019/03/28 09:28:00 UTC

[jira] [Created] (SPARK-27318) Join operation on bucketed table fails with adaptive execution enabled

Supritha created SPARK-27318:
--------------------------------

             Summary: Join operation on bucketed table fails with adaptive execution enabled
                 Key: SPARK-27318
                 URL: https://issues.apache.org/jira/browse/SPARK-27318
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.4.0
            Reporter: Supritha


A join between two bucketed tables fails when spark.sql.adaptive.enabled is set to true.

Steps to reproduce the issue:

spark.sql("set spark.sql.adaptive.enabled=true")

1. Create tables bucket3 and bucket4 as below and load the data.
 sql("create table bucket3(id3 int,country3 String, sports3 String) row format delimited fields terminated by ','").show()
 sql("create table bucket4(id4 int,country4 String) row format delimited fields terminated by ','").show()

sql("load data local inpath '/opt/abhidata/bucket2.txt' into table bucket3").show()
 sql("load data local inpath '/opt/abhidata/bucket3.txt' into table bucket4").show()

2. Create the bucketed tables as below; note that they deliberately use different bucket counts (3 and 4).
spark.sqlContext.table("bucket3").write.bucketBy(3, "id3").saveAsTable("bucketed_table_3");

spark.sqlContext.table("bucket4").write.bucketBy(4, "id4").saveAsTable("bucketed_table_4");

3. Execute the join query on the bucketed tables:

scala> sql("select * from bucketed_table_3 join bucketed_table_4 on bucketed_table_3.id3 = bucketed_table_4.id4").show()

 
{code:java}
java.lang.IllegalArgumentException: requirement failed: PartitioningCollection requires all of its partitionings have the same numPartitions.
  at scala.Predef$.require(Predef.scala:224)
  at org.apache.spark.sql.catalyst.plans.physical.PartitioningCollection.<init>(partitioning.scala:291)
  at org.apache.spark.sql.execution.joins.SortMergeJoinExec.outputPartitioning(SortMergeJoinExec.scala:69)
  at org.apache.spark.sql.execution.exchange.EnsureRequirements$$anonfun$org$apache$spark$sql$execution$exchange$EnsureRequirements$$ensureDistributionAndOrdering$1.apply(EnsureRequirements.scala:150)
  at org.apache.spark.sql.execution.exchange.EnsureRequirements$$anonfun$org$apache$spark$sql$execution$exchange$EnsureRequirements$$ensureDistributionAndOrdering$1.apply(EnsureRequirements.scala:149)
  at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
  at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
  at scala.collection.immutable.List.foreach(List.scala:392)
  at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
  at scala.collection.immutable.List.map(List.scala:296)
  at org.apache.spark.sql.execution.exchange.EnsureRequirements.org$apache$spark$sql$execution$exchange$EnsureRequirements$$ensureDistributionAndOrdering(EnsureRequirements.scala:149)
  at org.apache.spark.sql.execution.exchange.EnsureRequirements$$anonfun$apply$1.applyOrElse(EnsureRequirements.scala:304)
  at org.apache.spark.sql.execution.exchange.EnsureRequirements$$anonfun$apply$1.applyOrElse(EnsureRequirements.scala:296)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$2.apply(TreeNode.scala:282)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$2.apply(TreeNode.scala:282)
  at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:70)
  at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:281)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:275)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:275)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:326)
  at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:187)
  at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:324)
  at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:275)
  at org.apache.spark.sql.execution.exchange.EnsureRequirements.apply(EnsureRequirements.scala:296)
  at org.apache.spark.sql.execution.exchange.EnsureRequirements.apply(EnsureRequirements.scala:38)
  at org.apache.spark.sql.execution.QueryExecution$$anonfun$prepareForExecution$1.apply(QueryExecution.scala:87)
  at org.apache.spark.sql.execution.QueryExecution$$anonfun$prepareForExecution$1.apply(QueryExecution.scala:87)
  at scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:124)
  at scala.collection.immutable.List.foldLeft(List.scala:84)
  at org.apache.spark.sql.execution.QueryExecution.prepareForExecution(QueryExecution.scala:87)
  at org.apache.spark.sql.execution.QueryExecution.executedPlan$lzycompute(QueryExecution.scala:77)
  at org.apache.spark.sql.execution.QueryExecution.executedPlan(QueryExecution.scala:77)
  at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3360)
  at org.apache.spark.sql.Dataset.head(Dataset.scala:2545)
  at org.apache.spark.sql.Dataset.take(Dataset.scala:2759)
  at org.apache.spark.sql.Dataset.getRows(Dataset.scala:255)
  at org.apache.spark.sql.Dataset.showString(Dataset.scala:292)
  at org.apache.spark.sql.Dataset.show(Dataset.scala:746)
  at org.apache.spark.sql.Dataset.show(Dataset.scala:705)
  at org.apache.spark.sql.Dataset.show(Dataset.scala:714)
  ... 49 elided
{code}
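
From the trace, SortMergeJoinExec.outputPartitioning combines the output partitionings of both children into a PartitioningCollection, whose constructor requires that they all report the same numPartitions; with bucket counts of 3 and 4 that requirement cannot hold, and with adaptive execution enabled the planner apparently reaches this code path without first inserting a shuffle that would normalize the partition counts. Two possible workarounds (an untested sketch against this repro, not a fix for the underlying bug): disable adaptive execution, or rebucket one side so both tables use the same bucket count.

{code:scala}
// Workaround 1: disable adaptive execution for this query.
spark.sql("set spark.sql.adaptive.enabled=false")

// Workaround 2: rebucket one side so both tables have 3 buckets.
// (bucketed_table_4_b3 is a hypothetical name used for illustration.)
spark.table("bucket4").write.bucketBy(3, "id4").saveAsTable("bucketed_table_4_b3")
sql("select * from bucketed_table_3 join bucketed_table_4_b3 on bucketed_table_3.id3 = bucketed_table_4_b3.id4").show()
{code}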

 


