Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/12/07 01:36:15 UTC

[GitHub] [spark] HyukjinKwon commented on a change in pull request #26781: [SPARK-30151][SQL] Issue better error message when user-specified schema mismatched

HyukjinKwon commented on a change in pull request #26781: [SPARK-30151][SQL] Issue better error message when user-specified schema mismatched
URL: https://github.com/apache/spark/pull/26781#discussion_r355089705
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala
 ##########
 @@ -339,11 +339,34 @@ case class DataSource(
         dataSource.createRelation(sparkSession.sqlContext, caseInsensitiveOptions)
       case (_: SchemaRelationProvider, None) =>
         throw new AnalysisException(s"A schema needs to be specified when using $className.")
-      case (dataSource: RelationProvider, Some(schema)) =>
+      case (dataSource: RelationProvider, Some(specifiedSchema)) =>
         val baseRelation =
           dataSource.createRelation(sparkSession.sqlContext, caseInsensitiveOptions)
-        if (baseRelation.schema != schema) {
-          throw new AnalysisException(s"$className does not allow user-specified schemas.")
+        val persistentSchema = baseRelation.schema
+        val persistentSize = persistentSchema.size
+        val specifiedSize = specifiedSchema.size
+        if (persistentSize == specifiedSize) {
+          val (persistentFields, specifiedFields) = persistentSchema.zip(specifiedSchema)
+            .filter { case (existedField, userField) => existedField != userField }
+            .unzip
+          if (persistentFields.nonEmpty) {
+            val errorMsg =
+              s"Mismatched fields detected between persistent schema and user specified schema: " +
 
 Review comment:
   nit: it seems we can remove the `s` interpolator prefix from string literals that contain no interpolation, such as this one.
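
   A minimal sketch of what this nit might look like in practice. Only the first literal comes from the diff above; the rest of the message is not visible in the hunk, so the continuation, the sample field values, and the `InterpolatorNit` object name are hypothetical and used purely for illustration:

```scala
object InterpolatorNit {
  // Hypothetical stand-in values; the real code derives these from the two schemas.
  val persistentFields: Seq[String] = Seq("id: int")
  val specifiedFields: Seq[String] = Seq("id: string")

  // The `s` prefix is redundant here: the literal has no `$` placeholder,
  // so both forms produce exactly the same string.
  val withPrefix: String =
    s"Mismatched fields detected between persistent schema and user specified schema: "
  val withoutPrefix: String =
    "Mismatched fields detected between persistent schema and user specified schema: "

  // Join the fields first so the interpolated pieces stay simple.
  private val persistentDesc = persistentFields.mkString(", ")
  private val specifiedDesc = specifiedFields.mkString(", ")

  // Keep `s` only on the pieces that actually interpolate a value.
  // (This continuation is an assumption, not the PR's actual message.)
  val errorMsg: String =
    "Mismatched fields detected between persistent schema and user specified schema: " +
      s"persistentFields: $persistentDesc, " +
      s"specifiedFields: $specifiedDesc"
}
```

   Dropping the prefix does not change behavior; it only makes it easier to see at a glance which pieces of the concatenation depend on runtime values.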
