Posted to commits@spark.apache.org by we...@apache.org on 2020/02/20 14:25:56 UTC
[spark] branch branch-3.0 updated: [SPARK-26071][FOLLOWUP] Improve migration guide of disallowing map type map key
This is an automated email from the ASF dual-hosted git repository.
wenchen pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-3.0 by this push:
new c1000a6 [SPARK-26071][FOLLOWUP] Improve migration guide of disallowing map type map key
c1000a6 is described below
commit c1000a6bdce53f171ff00ea03b515950aaff4f95
Author: Wenchen Fan <we...@databricks.com>
AuthorDate: Thu Feb 20 22:10:04 2020 +0800
[SPARK-26071][FOLLOWUP] Improve migration guide of disallowing map type map key
### What changes were proposed in this pull request?
Mention the workaround for users who do want to use a map type as a map key, and add a test to demonstrate it.
### Why are the changes needed?
It is better to provide an alternative when we ban something.
### Does this PR introduce any user-facing change?
No.
### How was this patch tested?
N/A
Closes #27621 from cloud-fan/map.
Authored-by: Wenchen Fan <we...@databricks.com>
Signed-off-by: Wenchen Fan <we...@databricks.com>
(cherry picked from commit 704d249a56325fce4a8179a2a7a242b9469aa6ec)
Signed-off-by: Wenchen Fan <we...@databricks.com>
---
docs/sql-migration-guide.md | 2 +-
.../test/scala/org/apache/spark/sql/DataFrameFunctionsSuite.scala | 8 ++++++++
2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/docs/sql-migration-guide.md b/docs/sql-migration-guide.md
index 0690127..9b74b45 100644
--- a/docs/sql-migration-guide.md
+++ b/docs/sql-migration-guide.md
@@ -43,7 +43,7 @@ license: |
- The `ADD JAR` command previously returned a result set with the single value 0. It now returns an empty result set.
- - In Spark version 2.4 and earlier, users can create map values with map type key via built-in function like `CreateMap`, `MapFromArrays`, etc. Since Spark 3.0, it's not allowed to create map values with map type key with these built-in functions. Users can still read map values with map type key from data source or Java/Scala collections, though they are not very useful.
+ - In Spark version 2.4 and earlier, users can create map values with map type key via built-in function such as `CreateMap`, `MapFromArrays`, etc. Since Spark 3.0, it's not allowed to create map values with map type key with these built-in functions. Users can use `map_entries` function to convert map to array<struct<key, value>> as a workaround. In addition, users can still read map values with map type key from data source or Java/Scala collections, though it is discouraged.
- In Spark version 2.4 and earlier, `Dataset.groupByKey` results to a grouped dataset with key attribute wrongly named as "value", if the key is non-struct type, e.g. int, string, array, etc. This is counterintuitive and makes the schema of aggregation queries weird. For example, the schema of `ds.groupByKey(...).count()` is `(value, count)`. Since Spark 3.0, we name the grouping attribute to "key". The old behaviour is preserved under a newly added configuration `spark.sql.legacy.data [...]
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/DataFrameFunctionsSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/DataFrameFunctionsSuite.scala
index 341b325..b4b9a48 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/DataFrameFunctionsSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/DataFrameFunctionsSuite.scala
@@ -3584,6 +3584,14 @@ class DataFrameFunctionsSuite extends QueryTest with SharedSparkSession {
}
}
}
+
+ test("SPARK-26071: convert map to array and use as map key") {
+ val df = Seq(Map(1 -> "a")).toDF("m")
+ intercept[AnalysisException](df.select(map($"m", lit(1))))
+ checkAnswer(
+ df.select(map(map_entries($"m"), lit(1))),
+ Row(Map(Seq(Row(1, "a")) -> 1)))
+ }
}
object DataFrameFunctionsSuite {
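
For readers who want to try the workaround from the migration guide outside the test suite, here is a minimal standalone sketch. It assumes Spark 3.0 or later on the classpath and a local master. The object name `MapKeyWorkaround`, the helper `withEntriesAsKey`, and the output column name `keyed` are hypothetical and not part of this commit. Only `map`, `map_entries`, `lit`, and `col` from `org.apache.spark.sql.functions` are used, mirroring the test above.

```scala
import scala.util.Try

import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, lit, map, map_entries}

// Hypothetical example object; not part of the Spark code base.
object MapKeyWorkaround {

  // Converts the map column "m" to array<struct<key, value>> with map_entries
  // and uses that array as the key of a new map column (named "keyed" here).
  def withEntriesAsKey(df: DataFrame): DataFrame =
    df.select(map(map_entries(col("m")), lit(1)).as("keyed"))

  def main(args: Array[String]): Unit = {
    // Assumption: a local single-threaded session is sufficient for the demo.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("map-key-workaround")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(Map(1 -> "a")).toDF("m")

    // Using the map column directly as a map key is rejected since Spark 3.0.
    val rejected = Try(df.select(map(col("m"), lit(1)))).isFailure
    println(s"map-typed key rejected: $rejected")

    // The workaround: the key is now an array of (key, value) structs.
    withEntriesAsKey(df).printSchema()

    spark.stop()
  }
}
```

The helper takes a `DataFrame` rather than building one itself so that the same conversion can be applied to any frame with a map column named `m`.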