Posted to commits@spark.apache.org by ma...@apache.org on 2022/11/23 06:14:11 UTC
[spark] branch master updated: [SPARK-41206][SQL][FOLLOWUP] Make result of `checkColumnNameDuplication` stable to fix `COLUMN_ALREADY_EXISTS` check failed with Scala 2.13
This is an automated email from the ASF dual-hosted git repository.
maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new e42d3836af9 [SPARK-41206][SQL][FOLLOWUP] Make result of `checkColumnNameDuplication` stable to fix `COLUMN_ALREADY_EXISTS` check failed with Scala 2.13
e42d3836af9 is described below
commit e42d3836af9eea881868c80f3c2cbc29e1d7b4f1
Author: yangjie01 <ya...@baidu.com>
AuthorDate: Wed Nov 23 09:13:56 2022 +0300
[SPARK-41206][SQL][FOLLOWUP] Make result of `checkColumnNameDuplication` stable to fix `COLUMN_ALREADY_EXISTS` check failed with Scala 2.13
### What changes were proposed in this pull request?
This PR sorts the grouped column names before `columnAlreadyExistsError` is thrown, so that the result of `SchemaUtils#checkColumnNameDuplication` is stable.
### Why are the changes needed?
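Fix the `COLUMN_ALREADY_EXISTS` check failures with Scala 2.13. `groupBy` returns a `Map` whose iteration order is not guaranteed and can differ between Scala versions, so when more than one column name is duplicated, the name reported in the error could differ from the one the tests expect.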
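A minimal standalone sketch of the idea, not the Spark code itself: `StableDuplicateLookup` and `firstDuplicate` are hypothetical names introduced here for illustration. Sorting the grouped entries by name before picking the first duplicate makes the reported name independent of how the `Map` happens to iterate.

```scala
// Illustrative sketch only; `StableDuplicateLookup` and `firstDuplicate`
// are hypothetical names and are not part of Spark.
object StableDuplicateLookup {
  // Returns the alphabetically smallest name that occurs more than once, if any.
  def firstDuplicate(names: Seq[String]): Option[String] =
    names
      .groupBy(identity) // Map[String, Seq[String]]; iteration order is unspecified
      .toSeq             // materialize the entries so they can be ordered
      .sortBy(_._1)      // order by name, independent of the Map implementation
      .collectFirst { case (name, occurrences) if occurrences.length > 1 => name }

  def main(args: Array[String]): Unit = {
    // Both "b" and "c" are duplicated; sorting makes "b" the stable answer.
    println(firstDuplicate(Seq("c", "b", "a", "b", "c"))) // Some(b)
  }
}
```

Without the `toSeq.sortBy(_._1)` step, which of `b` or `c` is returned would depend on the `Map` implementation in use.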
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
- Pass GitHub Actions (GA)
- Manual test:
```
dev/change-scala-version.sh 2.13
build/sbt clean "sql/testOnly org.apache.spark.sql.DataFrameSuite" -Pscala-2.13
build/sbt "sql/testOnly org.apache.spark.sql.execution.datasources.json.JsonV1Suite" -Pscala-2.13
build/sbt "sql/testOnly org.apache.spark.sql.execution.datasources.json.JsonV2Suite" -Pscala-2.13
build/sbt "sql/testOnly org.apache.spark.sql.execution.datasources.json.JsonLegacyTimeParserSuite" -Pscala-2.13
```
All tests passed
Closes #38764 from LuciferYang/SPARK-41206.
Authored-by: yangjie01 <ya...@baidu.com>
Signed-off-by: Max Gekk <ma...@gmail.com>
---
sql/catalyst/src/main/scala/org/apache/spark/sql/util/SchemaUtils.scala | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/util/SchemaUtils.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/util/SchemaUtils.scala
index aac96a9b56c..d202900381a 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/util/SchemaUtils.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/util/SchemaUtils.scala
@@ -107,7 +107,7 @@ private[spark] object SchemaUtils {
val names = if (caseSensitiveAnalysis) columnNames else columnNames.map(_.toLowerCase)
// scalastyle:on caselocale
if (names.distinct.length != names.length) {
- val columnName = names.groupBy(identity).collectFirst {
+ val columnName = names.groupBy(identity).toSeq.sortBy(_._1).collectFirst {
case (x, ys) if ys.length > 1 => x
}.get
throw QueryCompilationErrors.columnAlreadyExistsError(columnName)