Posted to issues@spark.apache.org by "L. C. Hsieh (Jira)" <ji...@apache.org> on 2021/10/07 02:28:00 UTC

[jira] [Resolved] (SPARK-36918) unionByName shouldn't consider types when comparing structs

     [ https://issues.apache.org/jira/browse/SPARK-36918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

L. C. Hsieh resolved SPARK-36918.
---------------------------------
    Fix Version/s: 3.3.0
       Resolution: Fixed

Issue resolved by pull request 34166
[https://github.com/apache/spark/pull/34166]

> unionByName shouldn't consider types when comparing structs
> -----------------------------------------------------------
>
>                 Key: SPARK-36918
>                 URL: https://issues.apache.org/jira/browse/SPARK-36918
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.2.0
>            Reporter: Adam Binford
>            Assignee: Adam Binford
>            Priority: Major
>             Fix For: 3.3.0
>
>
> Improvement/follow-on of https://issues.apache.org/jira/browse/SPARK-35290.
> We use StructType.sameType to decide whether a struct needs to be recreated, but this gives false positives when two structs have the same structure but different field types: the check fails, and we simply create a new struct that's exactly the same as the original. This can cause significant overhead when unioning multiple deeply nested nullable structs, because each time a struct is recreated it gets wrapped in an If(IsNull()). Comparing only the field names can lead to more efficient plans.
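
To make the distinction in the description concrete, here is a minimal sketch in plain Python. It is a toy model, not Spark's code: the `Struct` class and the `same_type` and `same_field_names` helpers are hypothetical names invented for this illustration, and the change in the pull request may differ in detail (for example in how it handles case sensitivity or nullability). It only shows why a strict type comparison flags two structs as different when a name-only comparison does not.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Struct:
    # Ordered (field name, field type) pairs. A field type is either a
    # string such as "int", or a nested Struct. Hypothetical toy model.
    fields: Tuple[Tuple[str, "FieldType"], ...]


FieldType = Union[str, Struct]


def same_type(a: FieldType, b: FieldType) -> bool:
    """Strict check: field names, order, and leaf types must all match."""
    if isinstance(a, Struct) and isinstance(b, Struct):
        return len(a.fields) == len(b.fields) and all(
            na == nb and same_type(ta, tb)
            for (na, ta), (nb, tb) in zip(a.fields, b.fields)
        )
    return a == b


def same_field_names(a: FieldType, b: FieldType) -> bool:
    """Name-only check: leaf types are ignored, nested structs are recursed."""
    if isinstance(a, Struct) and isinstance(b, Struct):
        return len(a.fields) == len(b.fields) and all(
            na == nb and same_field_names(ta, tb)
            for (na, ta), (nb, tb) in zip(a.fields, b.fields)
        )
    # Two leaves match regardless of type; a leaf never matches a struct.
    return not isinstance(a, Struct) and not isinstance(b, Struct)


# Same field names and order; only the leaf type of "a" differs.
left = Struct((("a", "int"), ("b", Struct((("c", "string"),)))))
right = Struct((("a", "long"), ("b", Struct((("c", "string"),)))))

# Same fields as `left`, but in a different order.
reordered = Struct((("b", Struct((("c", "string"),))), ("a", "int")))

strict_match = same_type(left, right)
names_match = same_field_names(left, right)
reordered_match = same_field_names(left, reordered)
```

In this model `left` and `right` fail the strict check only because `a` is `int` on one side and `long` on the other, which is the case the description calls a false positive. The name-only check accepts them, while still rejecting `reordered`, where the struct's field order really does differ.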



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
