Posted to issues@spark.apache.org by "Pavel Chernikov (Jira)" <ji...@apache.org> on 2021/03/22 23:08:00 UTC

[jira] [Created] (SPARK-34829) transform_values return identical values while operating on complex types

Pavel Chernikov created SPARK-34829:
---------------------------------------

             Summary: transform_values return identical values while operating on complex types
                 Key: SPARK-34829
                 URL: https://issues.apache.org/jira/browse/SPARK-34829
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.1.1
            Reporter: Pavel Chernikov


If the map values are {{StructType}}s, the behavior of {{transform_values}} is inconsistent: it may return the same value for every key. More precisely, it appears to return identical values whenever the function's return type is an {{AnyRef}} (a reference type such as a case class or {{String}}) rather than a primitive.

Consider the following examples. A UDF that returns an {{Int}} behaves as expected:

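{code:scala}
// Written for spark-shell (Spark 3.1.1). The imports are spelled out for clarity.
import org.apache.spark.sql.functions.{col, transform_values, udf}
import spark.implicits._ // needed for toDF; assumes a SparkSession named `spark`

case class Bar(i: Int)
// UDF returning a primitive (Int): the output below is correct.
val square = udf((b: Bar) => b.i * b.i)
val df = Seq(Map(1 -> Bar(1), 2 -> Bar(2), 3 -> Bar(3))).toDF("map")
df.withColumn("map_square", transform_values(col("map"), (_, v) => square(v))).show(truncate = false)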
+------------------------------+------------------------+
|map                           |map_square              |
+------------------------------+------------------------+
|{1 -> {1}, 2 -> {2}, 3 -> {3}}|{1 -> 1, 2 -> 4, 3 -> 9}|
+------------------------------+------------------------+
{code}
vs. a UDF that returns a case class, where every value in the result is identical:
{code:scala}
case class Bar(i: Int)
case class BarSquare(i: Int)
// UDF returning a case class (a struct).
// Expected {1 -> {1}, 2 -> {4}, 3 -> {9}}, but every value comes back as {9}.
val square = udf((b: Bar) => BarSquare(b.i * b.i))
val df = Seq(Map(1 -> Bar(1), 2 -> Bar(2), 3 -> Bar(3))).toDF("map")
df.withColumn("map_square", transform_values(col("map"), (_, v) => square(v))).show(truncate = false)
+------------------------------+------------------------------+
|map                           |map_square                    |
+------------------------------+------------------------------+
|{1 -> {1}, 2 -> {2}, 3 -> {3}}|{1 -> {9}, 2 -> {9}, 3 -> {9}}|
+------------------------------+------------------------------+
{code}
The same happens even with a UDF that simply returns a {{String}}:

{code:scala}
case class Foo(s: String)
// UDF returning a String.
// Expected {1 -> cba, 2 -> mlk, 3 -> zyx}, but every value comes back as zyx.
val reverse = udf((f: Foo) => f.s.reverse)
val df = Seq(Map(1 -> Foo("abc"), 2 -> Foo("klm"), 3 -> Foo("xyz"))).toDF("map")
df.withColumn("map_reverse", transform_values(col("map"), (_, v) => reverse(v))).show(truncate = false)
+------------------------------------+------------------------------+
|map                                 |map_reverse                   |
+------------------------------------+------------------------------+
|{1 -> {abc}, 2 -> {klm}, 3 -> {xyz}}|{1 -> zyx, 2 -> zyx, 3 -> zyx}|
+------------------------------------+------------------------------+
{code}
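
In both failing outputs above, every key maps to the result computed for the last map entry, while the {{Int}}-returning UDF is unaffected. This report does not identify a cause. As an illustration only, the plain-Scala sketch below (no Spark involved) shows one pattern that would produce the same symptom: a mutable object reused across elements, with results kept by reference instead of being copied. Whether Spark does anything like this is an assumption, not something established here, and the names {{Holder}} and {{ReuseSketch}} are hypothetical.

{code:scala}
// Hypothetical illustration only; this is NOT Spark's code.
// A single mutable holder is reused for every element, mimicking an
// evaluator that writes each result into the same object.
final class Holder(var value: Int)

object ReuseSketch {
  private val inputs = Seq(1, 2, 3)

  // Keeps a reference to the shared holder for each input.
  def byReference(): Seq[Holder] = {
    val shared = new Holder(0)
    inputs.map { i =>
      shared.value = i * i
      shared // the same object every time
    }
  }

  // Reads the primitive out of the holder before it is overwritten.
  def byValue(): Seq[Int] = {
    val shared = new Holder(0)
    inputs.map { i =>
      shared.value = i * i
      shared.value // an Int is copied, not aliased
    }
  }

  // Defensive copy: allocates a fresh holder per element.
  def byCopy(): Seq[Holder] = {
    val shared = new Holder(0)
    inputs.map { i =>
      shared.value = i * i
      new Holder(shared.value)
    }
  }

  def main(args: Array[String]): Unit = {
    println(byReference().map(_.value)) // List(9, 9, 9)
    println(byValue())                  // List(1, 4, 9)
    println(byCopy().map(_.value))      // List(1, 4, 9)
  }
}
{code}

If the real cause is similar, copying each result before storing it (as in {{byCopy}}) would be the natural direction for a fix, but that remains to be confirmed against the Spark source.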


