Posted to reviews@spark.apache.org by "itholic (via GitHub)" <gi...@apache.org> on 2023/12/11 07:13:15 UTC

[PR] [SPARK-46360][PYTHON] Enhance error message debugging with new `getMessage` API [spark]

itholic opened a new pull request, #44292:
URL: https://github.com/apache/spark/pull/44292

   
   ### What changes were proposed in this pull request?
   
   This PR introduces a `getMessage` API that gives users a standardized way to obtain a concise and clear error message.
   
   ### Why are the changes needed?
   
   Previously, extracting a simple and informative error message in PySpark was not straightforward. The internal `ErrorClassesReader.get_error_message` method was often used, but for JVM-originated errors not defined in `error_classes.py`, obtaining a succinct error message was challenging.
   
   The new `getMessage` API harmonizes error message retrieval across PySpark, leveraging existing JVM implementations to ensure consistency and clarity in the messages presented to the users.
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes, this PR introduces a `getMessage` API for directly accessing simplified error messages in PySpark.
   
   - **Before**: No official API for simplified error messages; excessive details in the error output:
       ```python
       from pyspark.sql.utils import AnalysisException
       
       try:
           spark.sql("""SELECT a""")
       except AnalysisException as e:
           str(e)
       # "[UNRESOLVED_COLUMN.WITHOUT_SUGGESTION] A column, variable, or function parameter with name `a` cannot be resolved.  SQLSTATE: 42703; line 1 pos 7;\n'Project ['a]\n+- OneRowRelation\n"
       ```
   
   - **After**: The `getMessage` API provides streamlined, user-friendly error messages:
       ```python
       from pyspark.sql.utils import AnalysisException
       
       try:
           spark.sql("""SELECT a""")
       except AnalysisException as e:
           e.getMessage()
       # '[UNRESOLVED_COLUMN.WITHOUT_SUGGESTION] A column, variable, or function parameter with name `a` cannot be resolved.  SQLSTATE: 42703'
       ```
   
   
   
   ### How was this patch tested?
   
   Added unit tests.
   
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   No.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46360][PYTHON] Enhance error message debugging with new `getMessage` API [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on code in PR #44292:
URL: https://github.com/apache/spark/pull/44292#discussion_r1422853846


##########
python/pyspark/errors/exceptions/base.py:
##########
@@ -89,13 +91,28 @@ def getSqlState(self) -> Optional[str]:
         See Also
         --------
         :meth:`PySparkException.getErrorClass`
+        :meth:`PySparkException.getMessage`
         :meth:`PySparkException.getMessageParameters`
         """
         return None
 
+    def getMessage(self) -> str:
+        """
+        Returns full error message.
+
+        .. versionadded:: 4.0.0
+
+        See Also
+        --------
+        :meth:`PySparkException.getErrorClass`
+        :meth:`PySparkException.getMessageParameters`
+        :meth:`PySparkException.getSqlState`
+        """
+        return f"[{self.getErrorClass()}] {self._message}"
+
     def __str__(self) -> str:
         if self.getErrorClass() is not None:
-            return f"[{self.getErrorClass()}] {self._message}"

Review Comment:
   Is this the same as `Exception.toString` on the JVM side? We should probably keep `__str__` as is.





Re: [PR] [SPARK-46360][PYTHON] Enhance error message debugging with new `getMessage` API [spark]

Posted by "itholic (via GitHub)" <gi...@apache.org>.
itholic commented on PR #44292:
URL: https://github.com/apache/spark/pull/44292#issuecomment-1849454284

   cc @garlandz-db FYI




Re: [PR] [SPARK-46360][PYTHON] Enhance error message debugging with new `getMessage` API [spark]

Posted by "itholic (via GitHub)" <gi...@apache.org>.
itholic commented on code in PR #44292:
URL: https://github.com/apache/spark/pull/44292#discussion_r1423294143


##########
python/pyspark/errors/exceptions/base.py:
##########
@@ -89,13 +91,28 @@ def getSqlState(self) -> Optional[str]:
         See Also
         --------
         :meth:`PySparkException.getErrorClass`
+        :meth:`PySparkException.getMessage`
         :meth:`PySparkException.getMessageParameters`
         """
         return None
 
+    def getMessage(self) -> str:
+        """
+        Returns full error message.
+
+        .. versionadded:: 4.0.0
+
+        See Also
+        --------
+        :meth:`PySparkException.getErrorClass`
+        :meth:`PySparkException.getMessageParameters`
+        :meth:`PySparkException.getSqlState`
+        """
+        return f"[{self.getErrorClass()}] {self._message}"
+
     def __str__(self) -> str:
         if self.getErrorClass() is not None:
-            return f"[{self.getErrorClass()}] {self._message}"

Review Comment:
   Yeah, we don't actually change the behavior of `__str__`; we just rearrange the code here, so I believe it should be fine.
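
To make the point about `__str__` concrete, here is a minimal sketch of the rearrangement being discussed. It rests on assumptions: `SketchException` is a hypothetical stand-in for `PySparkException`, reduced to the members visible in the hunk, and the body of `__str__` is an assumed shape, since the quoted hunk is cut off after the removed line. Only the f-string in `getMessage` is copied from the diff.

```python
from typing import Optional


class SketchException(Exception):
    """Hypothetical stand-in for PySparkException, reduced to the parts in the diff."""

    def __init__(self, message: str, error_class: Optional[str] = None):
        super().__init__(message)
        self._message = message
        self._error_class = error_class

    def getErrorClass(self) -> Optional[str]:
        return self._error_class

    def getMessage(self) -> str:
        # Same f-string as the getMessage added in the quoted hunk.
        return f"[{self.getErrorClass()}] {self._message}"

    def __str__(self) -> str:
        # Assumed shape of the rearranged __str__: delegate when an error
        # class is set, otherwise return the bare message as before.
        if self.getErrorClass() is not None:
            return self.getMessage()
        return self._message


with_class = SketchException("Column `a` cannot be resolved.", "UNRESOLVED_COLUMN")
without_class = SketchException("plain failure")
print(str(with_class))
print(str(without_class))
```

Under these assumptions, `str()` produces the same text it did before the change for both cases; the formatting logic has only moved into `getMessage`.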







Re: [PR] [SPARK-46360][PYTHON] Enhance error message debugging with new `getMessage` API [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #44292:
URL: https://github.com/apache/spark/pull/44292#issuecomment-1851305909

   Merged to master.




Re: [PR] [SPARK-46360][PYTHON] Enhance error message debugging with new `getMessage` API [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon closed pull request #44292: [SPARK-46360][PYTHON] Enhance error message debugging with new `getMessage` API
URL: https://github.com/apache/spark/pull/44292

