You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Guojian Li (Jira)" <ji...@apache.org> on 2020/08/17 11:17:00 UTC

[jira] [Created] (SPARK-32638) TypeCoercion$WidenSetOperationTypes causes subquery attribute unresolved exception

Guojian Li created SPARK-32638:
----------------------------------

             Summary: TypeCoercion$WidenSetOperationTypes causes subquery  attribute  unresolved exception
                 Key: SPARK-32638
                 URL: https://issues.apache.org/jira/browse/SPARK-32638
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.3.4
            Reporter: Guojian Li


I am migrating sql from mysql to spark sql. Below is code to reproduce the exception:

 
{code:java}
val spark = SparkSession.builder()
 .master("local")
 .appName("Word Count")
 .getOrCreate()
spark.sparkContext.setLogLevel("TRACE")
val DecimalType = DataTypes.createDecimalType(20, 2)
val schema = StructType(List(
 StructField("a", DecimalType, true)
))
val dataList = new util.ArrayList[Row]()


val df=spark.createDataFrame(dataList,schema)
df.printSchema()
df.createTempView("test")
val sql=
 """
 |SELECT t.kpi_04 FROM
 |(
 | SELECT a as `kpi_04` FROM test
 | UNION ALL
 | SELECT a+a as `kpi_04` FROM test
 |) t
 |
 """.stripMargin
spark.sql(sql)
{code}
 

 
{code:java}
Exception in thread "main" org.apache.spark.sql.AnalysisException: Resolved attribute(s) kpi_04#2 missing from kpi_04#4 in operator !Project [kpi_04#2]. Attribute(s) with the same name appear in the operation: kpi_04. Please check if the right attribute(s) are used.;;
!Project [kpi_04#2]
+- SubqueryAlias t
 +- Union
 :- Project [cast(kpi_04#2 as decimal(21,2)) AS kpi_04#4]
 : +- Project [a#0 AS kpi_04#2]
 : +- SubqueryAlias test
 : +- LocalRelation <empty>, [a#0]
 +- Project [kpi_04#3]
 +- Project [CheckOverflow((promote_precision(cast(a#0 as decimal(21,2))) + promote_precision(cast(a#0 as decimal(21,2)))), DecimalType(21,2)) AS kpi_04#3]
 +- SubqueryAlias test
 +- LocalRelation <empty>, [a#0]{code}
 

 

Base the trace log ,seemly the WidenSetOperationTypes add new outer project layer. It caused the parent query lose the reference to subquery. 

 

 
{code:java}
 
=== Applying Rule org.apache.spark.sql.catalyst.analysis.TypeCoercion$WidenSetOperationTypes ===
!'Project [kpi_04#2] !Project [kpi_04#2]
!+- 'SubqueryAlias t +- SubqueryAlias t
! +- 'Union +- Union
! :- Project [a#0 AS kpi_04#2] :- Project [cast(kpi_04#2 as decimal(21,2)) AS kpi_04#4]
! : +- SubqueryAlias test : +- Project [a#0 AS kpi_04#2]
! : +- LocalRelation <empty>, [a#0] : +- SubqueryAlias test
! +- Project [CheckOverflow((promote_precision(cast(a#0 as decimal(21,2))) + promote_precision(cast(a#0 as decimal(21,2)))), DecimalType(21,2)) AS kpi_04#3] : +- LocalRelation <empty>, [a#0]
! +- SubqueryAlias test +- Project [kpi_04#3]
! +- LocalRelation <empty>, [a#0] +- Project [CheckOverflow((promote_precision(cast(a#0 as decimal(21,2))) + promote_precision(cast(a#0 as decimal(21,2)))), DecimalType(21,2)) AS kpi_04#3]
! +- SubqueryAlias test
! +- LocalRelation <empty>, [a#0]
{code}
 

  in the source code ,WidenSetOperationTypes.scala. it is  a intent behavior, but  miss some edge case. 

I hope someone can help me out to fix it . 

 

 
{code:java}
if (targetTypes.nonEmpty) {
 // Add an extra Project if the targetTypes are different from the original types.
 children.map(widenTypes(_, targetTypes))
} else {
 // Unable to find a target type to widen, then just return the original set.
 children
}{code}
 

 

 

 

 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org