Posted to issues@spark.apache.org by "chendihao (Jira)" <ji...@apache.org> on 2020/10/30 09:14:00 UTC

[jira] [Created] (SPARK-33300) Rule SimplifyCasts will not work for nested columns

chendihao created SPARK-33300:
---------------------------------

             Summary: Rule SimplifyCasts will not work for nested columns
                 Key: SPARK-33300
                 URL: https://issues.apache.org/jira/browse/SPARK-33300
             Project: Spark
          Issue Type: Bug
          Components: Optimizer, SQL
    Affects Versions: 3.0.0
            Reporter: chendihao


We use Spark SQL and Catalyst to optimize our Spark jobs. We read the source code and tested the SimplifyCasts rule, which works for simple SQL without nested casts.

The SQL "select cast(string_date as string) from t1" is optimized as expected: the redundant cast is removed.

```
== Analyzed Logical Plan ==
string_date: string
Project [cast(string_date#12 as string) AS string_date#24]
+- SubqueryAlias t1
   +- LogicalRDD [name#8, c1#9, c2#10, c5#11L, string_date#12, string_timestamp#13, timestamp_field#14, bool_field#15], false

== Optimized Logical Plan ==
Project [string_date#12]
+- LogicalRDD [name#8, c1#9, c2#10, c5#11L, string_date#12, string_timestamp#13, timestamp_field#14, bool_field#15], false
```

However, it fails to optimize a nested cast such as "select cast(cast(string_date as string) as string) from t1".

```
== Analyzed Logical Plan ==
CAST(CAST(string_date AS STRING) AS STRING): string
Project [cast(cast(string_date#12 as string) as string) AS CAST(CAST(string_date AS STRING) AS STRING)#24]
+- SubqueryAlias t1
   +- LogicalRDD [name#8, c1#9, c2#10, c5#11L, string_date#12, string_timestamp#13, timestamp_field#14, bool_field#15], false

== Optimized Logical Plan ==
Project [string_date#12 AS CAST(CAST(string_date AS STRING) AS STRING)#24]
+- LogicalRDD [name#8, c1#9, c2#10, c5#11L, string_date#12, string_timestamp#13, timestamp_field#14, bool_field#15], false
```
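To make the expected behavior concrete, here is a minimal, self-contained sketch of the simplification discussed above: a cast whose target type already equals its child's type is dropped, and the check is applied bottom-up so that nested casts collapse as well. This is only a toy model under stated assumptions. The classes `Column` and `Cast` and the function `simplify_casts` are hypothetical names introduced for illustration; they are not Catalyst's classes and not Spark's actual implementation of the rule.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Column:
    """A column reference with a known data type (hypothetical toy class)."""
    name: str
    data_type: str


@dataclass(frozen=True)
class Cast:
    """cast(child as data_type) (hypothetical toy class, not Catalyst's Cast)."""
    child: "Expr"
    data_type: str


Expr = Union[Column, Cast]


def simplify_casts(expr: Expr) -> Expr:
    """Drop every cast whose target type already equals its child's type."""
    if isinstance(expr, Column):
        return expr
    # Simplify the child first so that nested casts are handled bottom-up.
    child = simplify_casts(expr.child)
    if child.data_type == expr.data_type:
        return child  # redundant cast: same type in and out
    return Cast(child, expr.data_type)


# Expressions mirroring the two queries in this report, plus one cast
# that must be kept because it really changes the type.
string_date = Column("string_date", "string")
single = Cast(string_date, "string")
nested = Cast(Cast(string_date, "string"), "string")
needed = Cast(Cast(string_date, "string"), "date")
```

In this model, `simplify_casts(single)` and `simplify_casts(nested)` both reduce to the bare `string_date` column, while `simplify_casts(needed)` keeps only the outer cast to `date`.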

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
