You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by "nchammas (via GitHub)" <gi...@apache.org> on 2024/01/24 06:03:36 UTC

[PR] [WIP][SPARK-46819] Move error categories and states into JSON [spark]

nchammas opened a new pull request, #44863:
URL: https://github.com/apache/spark/pull/44863

   ### What changes were proposed in this pull request?
   
   Move the error categories and states out of the Markdown table in the error README and into standalone JSON files.
   
   I am using the terms "error category" and "error state" as defined in SPARK-46810.
   
   TODO:
   - [ ] Add/update some tests to check the schema and contents of these new files.
   
   ### Why are the changes needed?
   
   The data captured in Markdown is not automation-friendly. In future work, I intend to automate the generation of [sql-error-conditions-sqlstates.md][states] using these new JSON files.
   
   Additionally, there are many instances of duplicate keys and inconsistent descriptions for the error categories, which this change resolves by moving the data into JSON. This also enables IDE support for basic validation.
   
   [states]: https://github.com/apache/spark/blob/6ea2094da252ce96d875696dbd68c4d1bc673dc4/docs/sql-error-conditions-sqlstates.md
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   New/updated unit tests.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   No.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [WIP][SPARK-46819] Move error categories and states into JSON [spark]

Posted by "itholic (via GitHub)" <gi...@apache.org>.
itholic commented on PR #44863:
URL: https://github.com/apache/spark/pull/44863#issuecomment-1907438834

   also cc @cloud-fan 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46819][CORE] Move error categories and states into JSON [spark]

Posted by "MaxGekk (via GitHub)" <gi...@apache.org>.
MaxGekk commented on PR #44863:
URL: https://github.com/apache/spark/pull/44863#issuecomment-1912152199

   +1, LGTM. Merging to master.
   Thank you, @nchammas and @itholic @srielau for review.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46819][CORE] Move error categories and states into JSON [spark]

Posted by "nchammas (via GitHub)" <gi...@apache.org>.
nchammas commented on PR #44863:
URL: https://github.com/apache/spark/pull/44863#issuecomment-1910546630

   > @nchammas I wonder did you generate [error-states.json](https://github.com/apache/spark/pull/44863/files#diff-09927682f352e689f38cdbce800dba011d33000d09b1b2c4ff9bd4aa49ac265f) automatically by a script?
   
   A little scripting and a little IDE kung-fu.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46819][CORE] Move error categories and states into JSON [spark]

Posted by "MaxGekk (via GitHub)" <gi...@apache.org>.
MaxGekk commented on code in PR #44863:
URL: https://github.com/apache/spark/pull/44863#discussion_r1466581920


##########
common/utils/src/main/scala/org/apache/spark/ErrorClassesJSONReader.scala:
##########
@@ -153,3 +153,14 @@ private case class ErrorSubInfo(message: Seq[String]) {
   @JsonIgnore
   val messageTemplate: String = message.mkString("\n")
 }
+
+/**
+ * Information associated with an error state / SQLSTATE.
+ *
+ * @param description
+ * @param origin The database system where this error state was first defined.

Review Comment:
   Precisely, `database system` -> DBMS



##########
common/utils/src/main/scala/org/apache/spark/ErrorClassesJSONReader.scala:
##########
@@ -153,3 +153,14 @@ private case class ErrorSubInfo(message: Seq[String]) {
   @JsonIgnore
   val messageTemplate: String = message.mkString("\n")
 }
+
+/**
+ * Information associated with an error state / SQLSTATE.
+ *
+ * @param description

Review Comment:
   Add a sentence about the parameter.



##########
common/utils/src/main/scala/org/apache/spark/ErrorClassesJSONReader.scala:
##########
@@ -153,3 +153,14 @@ private case class ErrorSubInfo(message: Seq[String]) {
   @JsonIgnore
   val messageTemplate: String = message.mkString("\n")
 }
+
+/**
+ * Information associated with an error state / SQLSTATE.
+ *
+ * @param description
+ * @param origin The database system where this error state was first defined.
+ * @param standard Whether this error state is part of the SQL standard.
+ * @param usedBy What database systems use this error state.
+ */
+private case class ErrorStateInfo(
+  description: String, origin: String, standard: String, usedBy: List[String])

Review Comment:
   Please, fix indentation, see https://github.com/databricks/scala-style-guide?tab=readme-ov-file#indent
   ```suggestion
       description: String,
       origin: String,
       standard: String,
       usedBy: List[String])
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46819][CORE] Move error categories and states into JSON [spark]

Posted by "srielau (via GitHub)" <gi...@apache.org>.
srielau commented on PR #44863:
URL: https://github.com/apache/spark/pull/44863#issuecomment-1909383470

   Thanks for doing this. Adding a row to this table is royal pain in the behind. This will make it easier.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [WIP][SPARK-46819] Move error categories and states into JSON [spark]

Posted by "nchammas (via GitHub)" <gi...@apache.org>.
nchammas commented on PR #44863:
URL: https://github.com/apache/spark/pull/44863#issuecomment-1907432511

   @MaxGekk - Before I continue down this path, do you like where this is going?
   
   Asking you per @itholic's [comment on SPARK-46810][1].
   
   [1]: https://issues.apache.org/jira/browse/SPARK-46810?focusedCommentId=17810156&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17810156


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46819][CORE] Move error categories and states into JSON [spark]

Posted by "nchammas (via GitHub)" <gi...@apache.org>.
nchammas commented on PR #44863:
URL: https://github.com/apache/spark/pull/44863#issuecomment-1908781506

   @MaxGekk - The naming choices I made here are based on my proposal in SPARK-46810. Are you OK with what I wrote there? (If yes, could you chime in on the ticket just for the record, please?)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46819][CORE] Move error categories and states into JSON [spark]

Posted by "MaxGekk (via GitHub)" <gi...@apache.org>.
MaxGekk commented on code in PR #44863:
URL: https://github.com/apache/spark/pull/44863#discussion_r1467713122


##########
common/utils/src/main/resources/error/README.md:
##########
@@ -116,1241 +116,4 @@ The following SQLSTATEs are collated from:
 - PostgreSQL 15
 - Oracle 12 (last published)
 - SQL Server
-- Redshift.
-
-<!-- SQLSTATE table start -->
-|SQLSTATE |Class|Condition                                         |Subclass|Subcondition                                                |Origin         |Standard|Used By                                                                     |

Review Comment:
   This is not public doc, correct?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46819][CORE] Move error categories and states into JSON [spark]

Posted by "MaxGekk (via GitHub)" <gi...@apache.org>.
MaxGekk closed pull request #44863: [SPARK-46819][CORE] Move error categories and states into JSON
URL: https://github.com/apache/spark/pull/44863


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46819][CORE] Move error categories and states into JSON [spark]

Posted by "MaxGekk (via GitHub)" <gi...@apache.org>.
MaxGekk commented on PR #44863:
URL: https://github.com/apache/spark/pull/44863#issuecomment-1908818132

   @nchammas I will review this PR tomorrow. Also cc @srielau


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46819][CORE] Move error categories and states into JSON [spark]

Posted by "nchammas (via GitHub)" <gi...@apache.org>.
nchammas commented on PR #44863:
URL: https://github.com/apache/spark/pull/44863#issuecomment-1908907064

   No rush at all. (Didn't mean to nag. Dunno if my comment came off that way.)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46819][CORE] Move error categories and states into JSON [spark]

Posted by "MaxGekk (via GitHub)" <gi...@apache.org>.
MaxGekk commented on PR #44863:
URL: https://github.com/apache/spark/pull/44863#issuecomment-1912150488

   The failure [Run / Run Spark on Kubernetes Integration test](https://github.com/nchammas/spark/actions/runs/7664194822/job/20888189765#logs) is not related to the changes, I do believe.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46819][CORE] Move error categories and states into JSON [spark]

Posted by "nchammas (via GitHub)" <gi...@apache.org>.
nchammas commented on code in PR #44863:
URL: https://github.com/apache/spark/pull/44863#discussion_r1467766950


##########
common/utils/src/main/resources/error/README.md:
##########
@@ -116,1241 +116,4 @@ The following SQLSTATEs are collated from:
 - PostgreSQL 15
 - Oracle 12 (last published)
 - SQL Server
-- Redshift.
-
-<!-- SQLSTATE table start -->
-|SQLSTATE |Class|Condition                                         |Subclass|Subcondition                                                |Origin         |Standard|Used By                                                                     |

Review Comment:
   Correct, not public. I will be opening separate PRs to update our public facing documentation on error classes.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org