Posted to reviews@spark.apache.org by "ueshin (via GitHub)" <gi...@apache.org> on 2023/03/24 00:17:34 UTC

[GitHub] [spark] ueshin opened a new pull request, #40538: [SPARK-42911][PYTHON] Introduce more basic exceptions

ueshin opened a new pull request, #40538:
URL: https://github.com/apache/spark/pull/40538

   ### What changes were proposed in this pull request?
   
   Introduces more basic exceptions.
   
   - ArithmeticException
   - ArrayIndexOutOfBoundsException
   - DateTimeException
   - NumberFormatException
   - SparkRuntimeException
   
   ### Why are the changes needed?
   
   There are more exceptions that Spark throws but PySpark doesn't capture.
   
   We should introduce more basic exceptions; otherwise we still see `Py4JJavaError` or `SparkConnectGrpcException`.
   
   ```py
   >>> spark.conf.set("spark.sql.ansi.enabled", True)
   >>> spark.sql("select 1/0")
   DataFrame[(1 / 0): double]
   >>> spark.sql("select 1/0").show()
   Traceback (most recent call last):
   ...
   py4j.protocol.Py4JJavaError: An error occurred while calling o44.showString.
   : org.apache.spark.SparkArithmeticException: [DIVIDE_BY_ZERO] Division by zero. Use `try_divide` to tolerate divisor being 0 and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
   == SQL(line 1, position 8) ==
   select 1/0
          ^^^
   
   	at org.apache.spark.sql.errors.QueryExecutionErrors$.divideByZeroError(QueryExecutionErrors.scala:225)
   ... JVM's stacktrace
   ```
   
   ```py
   >>> spark.sql("select 1/0").show()
   Traceback (most recent call last):
   ...
   pyspark.errors.exceptions.connect.SparkConnectGrpcException: (org.apache.spark.SparkArithmeticException) [DIVIDE_BY_ZERO] Division by zero. Use `try_divide` to tolerate divisor being 0 and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
   == SQL(line 1, position 8) ==
   select 1/0
          ^^^
   ```
   
   ### Does this PR introduce _any_ user-facing change?
   
   The error message is more readable.
   
   ```py
   >>> spark.sql("select 1/0").show()
   Traceback (most recent call last):
   ...
   pyspark.errors.exceptions.captured.ArithmeticException: [DIVIDE_BY_ZERO] Division by zero. Use `try_divide` to tolerate divisor being 0 and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
   == SQL(line 1, position 8) ==
   select 1/0
          ^^^
   ```
   
   or, with a remote (Spark Connect) session:
   
   ```py
   >>> spark.sql("select 1/0").show()
   Traceback (most recent call last):
   ...
   pyspark.errors.exceptions.connect.ArithmeticException: [DIVIDE_BY_ZERO] Division by zero. Use `try_divide` to tolerate divisor being 0 and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
   == SQL(line 1, position 8) ==
   select 1/0
          ^^^
   ```
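
A minimal sketch of what this enables on the user side, assuming the inheritance pattern visible in the diff for `connect.py` (each concrete class combines a session-specific parent with a shared base class). Every class and helper below is a stand-in written for illustration, not the real `pyspark.errors` code, and `run_query` and `describe_failure` are hypothetical names.

```python
# Stand-in classes that model the naming used in this PR.
# They are NOT the real pyspark.errors classes.


class PySparkException(Exception):
    """Stand-in for a common PySpark base exception."""


class ArithmeticException(PySparkException):
    """Stand-in for the session-independent base class."""


class CapturedException(PySparkException):
    """Stand-in for errors captured from the JVM in a regular session."""


class SparkConnectGrpcException(PySparkException):
    """Stand-in for errors reported by a remote (Spark Connect) session."""


class CapturedArithmeticException(CapturedException, ArithmeticException):
    """Regular-session flavor of the arithmetic error."""


class ConnectArithmeticException(SparkConnectGrpcException, ArithmeticException):
    """Remote-session flavor of the arithmetic error."""


def run_query(remote: bool) -> None:
    """Hypothetical helper that fails like `select 1/0` under ANSI mode."""
    message = "[DIVIDE_BY_ZERO] Division by zero."
    if remote:
        raise ConnectArithmeticException(message)
    raise CapturedArithmeticException(message)


def describe_failure(remote: bool) -> str:
    """Catch the shared base class, whichever session type raised it."""
    try:
        run_query(remote)
    except ArithmeticException as e:
        return f"{type(e).__name__}: {e}"
    return "no error"
```

Under this assumption, a single `except ArithmeticException` clause handles both session types, instead of matching on `Py4JJavaError` or `SparkConnectGrpcException` and parsing the message.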
   
   ### How was this patch tested?
   
   Added the related tests.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] itholic commented on pull request #40538: [SPARK-42911][PYTHON] Introduce more basic exceptions

Posted by "itholic (via GitHub)" <gi...@apache.org>.
itholic commented on PR #40538:
URL: https://github.com/apache/spark/pull/40538#issuecomment-1482146445

   Ah, I see. One is for a regular Spark session and the other is for a remote Spark session.




[GitHub] [spark] ueshin commented on pull request #40538: [SPARK-42911][PYTHON] Introduce more basic exceptions

Posted by "ueshin (via GitHub)" <gi...@apache.org>.
ueshin commented on PR #40538:
URL: https://github.com/apache/spark/pull/40538#issuecomment-1483241667

   @HyukjinKwon #40547




[GitHub] [spark] HyukjinKwon commented on pull request #40538: [SPARK-42911][PYTHON] Introduce more basic exceptions

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #40538:
URL: https://github.com/apache/spark/pull/40538#issuecomment-1482560940

   @ueshin it has a conflict w/ branch-3.4. would you mind creating a backport PR?




[GitHub] [spark] ueshin commented on a diff in pull request #40538: [SPARK-42911][PYTHON] Introduce more basic exceptions

Posted by "ueshin (via GitHub)" <gi...@apache.org>.
ueshin commented on code in PR #40538:
URL: https://github.com/apache/spark/pull/40538#discussion_r1147030723


##########
python/pyspark/errors/exceptions/connect.py:
##########
@@ -119,13 +135,43 @@ class QueryExecutionException(SparkConnectGrpcException, BaseQueryExecutionExcep
     """
 
 
-class SparkUpgradeException(SparkConnectGrpcException, BaseSparkUpgradeException):
+class PythonException(SparkConnectGrpcException, BasePythonException):
     """
-    Exception thrown because of Spark upgrade from Spark Connect
+    Exceptions thrown from Spark Connect server.

Review Comment:
   This comment was carried over from the previous code. We can change it to `Spark Connect` while we are here.





[GitHub] [spark] ueshin commented on pull request #40538: [SPARK-42911][PYTHON] Introduce more basic exceptions

Posted by "ueshin (via GitHub)" <gi...@apache.org>.
ueshin commented on PR #40538:
URL: https://github.com/apache/spark/pull/40538#issuecomment-1482126322

   > the examples in "Does this PR introduce any user-facing change?" are the same??
   
   No. Previously we still saw `py4j.protocol.Py4JJavaError` or `SparkConnectGrpcException`; now we only see the actual exception classes.
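
One way to picture the difference is a lookup from the JVM error's class names to a specific Python class, with a generic fallback when nothing matches. This is only a hedged sketch of that idea and not PySpark's implementation: the Python classes, the `JVM_TO_PYTHON` table, and `to_python_exception` are hypothetical, and the list of JVM class names is an assumed input.

```python
from typing import Dict, Iterable, Type


class GenericSparkError(Exception):
    """Stand-in for a generic wrapper such as the ones seen before this PR."""


class ArithmeticException(GenericSparkError):
    """Stand-in for a specific exception class."""


class ArrayIndexOutOfBoundsException(GenericSparkError):
    """Stand-in for a specific exception class."""


class DateTimeException(GenericSparkError):
    """Stand-in for a specific exception class."""


class NumberFormatException(GenericSparkError):
    """Stand-in for a specific exception class."""


class SparkRuntimeException(GenericSparkError):
    """Stand-in for a specific exception class."""


# Hypothetical table: JVM class name -> Python stand-in class.
JVM_TO_PYTHON: Dict[str, Type[GenericSparkError]] = {
    "java.lang.ArithmeticException": ArithmeticException,
    "java.lang.ArrayIndexOutOfBoundsException": ArrayIndexOutOfBoundsException,
    "java.time.DateTimeException": DateTimeException,
    "java.lang.NumberFormatException": NumberFormatException,
    "org.apache.spark.SparkRuntimeException": SparkRuntimeException,
}


def to_python_exception(jvm_classes: Iterable[str], message: str) -> GenericSparkError:
    """Return the first matching specific exception, else the generic one."""
    for name in jvm_classes:
        cls = JVM_TO_PYTHON.get(name)
        if cls is not None:
            return cls(message)
    return GenericSparkError(message)
```

For example, passing `["org.apache.spark.SparkArithmeticException", "java.lang.ArithmeticException"]` yields the `ArithmeticException` stand-in, while an unknown class name falls back to `GenericSparkError`.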




[GitHub] [spark] ueshin commented on pull request #40538: [SPARK-42911][PYTHON] Introduce more basic exceptions

Posted by "ueshin (via GitHub)" <gi...@apache.org>.
ueshin commented on PR #40538:
URL: https://github.com/apache/spark/pull/40538#issuecomment-1482078921

   cc @itholic 




[GitHub] [spark] itholic commented on a diff in pull request #40538: [SPARK-42911][PYTHON] Introduce more basic exceptions

Posted by "itholic (via GitHub)" <gi...@apache.org>.
itholic commented on code in PR #40538:
URL: https://github.com/apache/spark/pull/40538#discussion_r1146999885


##########
python/pyspark/errors/exceptions/connect.py:
##########
@@ -119,13 +135,43 @@ class QueryExecutionException(SparkConnectGrpcException, BaseQueryExecutionExcep
     """
 
 
-class SparkUpgradeException(SparkConnectGrpcException, BaseSparkUpgradeException):
+class PythonException(SparkConnectGrpcException, BasePythonException):
     """
-    Exception thrown because of Spark upgrade from Spark Connect
+    Exceptions thrown from Spark Connect server.

Review Comment:
   qq: Are `Spark Connect server` and `Spark Connect` different?
   
   Only `PythonException` says it's thrown from Spark Connect "server".





[GitHub] [spark] HyukjinKwon commented on pull request #40538: [SPARK-42911][PYTHON] Introduce more basic exceptions

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #40538:
URL: https://github.com/apache/spark/pull/40538#issuecomment-1482558850

   Merged to master and branch-3.4.




[GitHub] [spark] HyukjinKwon closed pull request #40538: [SPARK-42911][PYTHON] Introduce more basic exceptions

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon closed pull request #40538: [SPARK-42911][PYTHON] Introduce more basic exceptions
URL: https://github.com/apache/spark/pull/40538




[GitHub] [spark] itholic commented on pull request #40538: [SPARK-42911][PYTHON] Introduce more basic exceptions

Posted by "itholic (via GitHub)" <gi...@apache.org>.
itholic commented on PR #40538:
URL: https://github.com/apache/spark/pull/40538#issuecomment-1482086635

   btw, the example in "Does this PR introduce any user-facing change?" is the same??

