Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/06/03 11:50:00 UTC

[GitHub] [spark] wangyum opened a new pull request, #36760: [SPARK-39374][SQL] Improve error message for user specified column list

wangyum opened a new pull request, #36760:
URL: https://github.com/apache/spark/pull/36760

   ### What changes were proposed in this pull request?
   
   This PR improves the error message raised when a user-specified column list names a column that does not exist in the target table. For example:
   ```sql
   create table t1(c1 int, c2 bigint, c3 string) using parquet;
   insert into t1(c1, c2, c4) values(1, 2, 3);
   ```
   Before this PR:
   ```
   Cannot resolve column name c4; line 1 pos 0
   org.apache.spark.sql.AnalysisException: Cannot resolve column name c4; line 1 pos 0
   ```
   After this PR:
   ```
   [MISSING_COLUMN] Column 'c4' does not exist. Did you mean one of the following? [c1, c2, c3]; line 1 pos 0
   org.apache.spark.sql.AnalysisException: [MISSING_COLUMN] Column 'c4' does not exist. Did you mean one of the following? [c1, c2, c3]; line 1 pos 0
   ```
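   The "After this PR" text can be reproduced with a minimal, self-contained Scala sketch. The object `MissingColumnSketch` and its `message` helper are hypothetical names and not Spark APIs; the template string is copied from the output above, and the `; line 1 pos 0` suffix appended by `AnalysisException` is left out:
   ```scala
   // Hypothetical stand-in for the MISSING_COLUMN message; not Spark's implementation.
   object MissingColumnSketch {
     // Template text copied from the "After this PR" output.
     private val Template =
       "[MISSING_COLUMN] Column '%s' does not exist. Did you mean one of the following? [%s]"

     // Fills in the unresolved column name and the comma-separated table columns.
     def message(col: String, tableColumns: Seq[String]): String =
       Template.format(col, tableColumns.mkString(", "))

     def main(args: Array[String]): Unit =
       println(message("c4", Seq("c1", "c2", "c3")))
   }
   ```
   The second argument mirrors the `i.table.output.map(_.name).mkString(", ")` expression that the change passes as a message parameter.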
   
   ### Why are the changes needed?
   
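   The new message names the missing column and lists the table's existing columns, which makes the mistake easier to spot and fix.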
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   Existing test.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk closed pull request #36760: [SPARK-39374][SQL] Improve error message for user specified column list

Posted by GitBox <gi...@apache.org>.
MaxGekk closed pull request #36760: [SPARK-39374][SQL] Improve error message for user specified column list
URL: https://github.com/apache/spark/pull/36760




[GitHub] [spark] MaxGekk commented on pull request #36760: [SPARK-39374][SQL] Improve error message for user specified column list

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on PR #36760:
URL: https://github.com/apache/spark/pull/36760#issuecomment-1146764006

   +1, LGTM. Merging to master.
   Thank you, @wangyum and @singhpk234 for review.




[GitHub] [spark] wangyum commented on a diff in pull request #36760: [SPARK-39374][SQL] Improve error message for user specified column list

Posted by GitBox <gi...@apache.org>.
wangyum commented on code in PR #36760:
URL: https://github.com/apache/spark/pull/36760#discussion_r889430728


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
##########
@@ -3424,9 +3424,10 @@ class Analyzer(override val catalogManager: CatalogManager)
         i.userSpecifiedCols, "in the column list", resolver)
 
       i.userSpecifiedCols.map { col =>
-          i.table.resolve(Seq(col), resolver)
-            .getOrElse(throw QueryCompilationErrors.cannotResolveUserSpecifiedColumnsError(
-              col, i.table))
+        i.table.resolve(Seq(col), resolver)
+          .getOrElse(i.failAnalysis(
+            errorClass = "MISSING_COLUMN",
+            messageParameters = Array(col, i.table.output.map(_.name).mkString(", "))))

Review Comment:
   The error for the following query also lists all of the table's columns:
   ```sql
   select not_exist_col from tbl;
   ```





[GitHub] [spark] singhpk234 commented on a diff in pull request #36760: [SPARK-39374][SQL] Improve error message for user specified column list

Posted by GitBox <gi...@apache.org>.
singhpk234 commented on code in PR #36760:
URL: https://github.com/apache/spark/pull/36760#discussion_r889140462


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
##########
@@ -3424,9 +3424,10 @@ class Analyzer(override val catalogManager: CatalogManager)
         i.userSpecifiedCols, "in the column list", resolver)
 
       i.userSpecifiedCols.map { col =>
-          i.table.resolve(Seq(col), resolver)
-            .getOrElse(throw QueryCompilationErrors.cannotResolveUserSpecifiedColumnsError(
-              col, i.table))
+        i.table.resolve(Seq(col), resolver)
+          .getOrElse(i.failAnalysis(
+            errorClass = "MISSING_COLUMN",
+            messageParameters = Array(col, i.table.output.map(_.name).mkString(", "))))

Review Comment:
   [question] A table can have more than 100 columns, and listing all of them in the error message would make it unusually long. Are we OK with that?
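
   One hypothetical way to bound the message size, which this PR does not implement, would be to cap how many column names are listed. The sketch below is only an illustration; `TruncatedCandidates`, `render`, and the `maxShown` parameter are assumed names, not Spark APIs:
   ```scala
   // Hypothetical helper: lists at most `maxShown` column names and
   // summarizes the rest with a count. Not part of Spark.
   object TruncatedCandidates {
     def render(columns: Seq[String], maxShown: Int): String = {
       require(maxShown > 0, "maxShown must be positive")
       val shown = columns.take(maxShown).mkString(", ")
       val hidden = columns.length - maxShown
       // Append a summary only when some names were left out.
       if (hidden > 0) s"$shown, ... ($hidden more)" else shown
     }

     def main(args: Array[String]): Unit =
       println(render(Seq("c1", "c2", "c3", "c4", "c5"), maxShown = 3))
   }
   ```
   With a cap like this the message length stays roughly constant regardless of table width, at the cost of possibly hiding the column the user meant.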


