Posted to reviews@spark.apache.org by "itholic (via GitHub)" <gi...@apache.org> on 2023/03/16 15:13:46 UTC

[GitHub] [spark] itholic opened a new pull request, #40458: [SPARK-42824][CONNECT][PYTHON] Provide a clear error message for unsupported JVM attributes.

itholic opened a new pull request, #40458:
URL: https://github.com/apache/spark/pull/40458

   ### What changes were proposed in this pull request?
   
   This pull request improves the error raised when code accesses a JVM attribute that Spark Connect does not support. The new message names the unsupported attribute and states that it is unavailable because it depends on the JVM, which Spark Connect does not rely on.
   
   ### Why are the changes needed?
   
   Currently, accessing an unsupported JVM attribute in Spark Connect raises a generic `AttributeError` that does not explain the root cause, so users cannot tell why the attribute is missing. This change makes the error state the cause, as shown below:
   
   **Before**
   ```python
   >>> spark._jsc
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   AttributeError: 'SparkSession' object has no attribute '_jsc'
   ```
   
   **After**
   ```python
   >>> spark._jsc
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
     File "/Users/haejoon.lee/Desktop/git_store/spark/python/pyspark/sql/connect/session.py", line 490, in _jsc
       raise PySparkAttributeError(
   pyspark.errors.exceptions.base.PySparkAttributeError: [JVM_ATTRIBUTE_NOT_SUPPORTED] Attribute `_jsc` is not supported in Spark Connect as it depends on the JVM.
   ```
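A minimal sketch of the pattern behind the "After" traceback, not the actual code in `pyspark/sql/connect/session.py`: a property on the session raises an error that carries an error class and fills in the attribute name. The names `JvmAttributeNotSupportedError`, `ERROR_TEMPLATES`, and `ConnectSessionSketch` are hypothetical stand-ins so that the example runs without PySpark installed.

```python
# Hypothetical template table, modeled on the entry quoted in this PR.
ERROR_TEMPLATES = {
    "JVM_ATTRIBUTE_NOT_SUPPORTED": (
        "Attribute `<attr_name>` is not supported in Spark Connect "
        "as it depends on the JVM."
    ),
}


class JvmAttributeNotSupportedError(AttributeError):
    """Hypothetical stand-in for PySparkAttributeError."""

    def __init__(self, error_class, message_parameters):
        self.error_class = error_class
        self.message_parameters = message_parameters
        # Substitute each <name> placeholder in the template.
        message = ERROR_TEMPLATES[error_class]
        for name, value in message_parameters.items():
            message = message.replace(f"<{name}>", value)
        super().__init__(f"[{error_class}] {message}")


class ConnectSessionSketch:
    """Hypothetical stand-in for the Spark Connect SparkSession."""

    @property
    def _jsc(self):
        # Defining the attribute explicitly lets it fail with a clear message
        # instead of the default "object has no attribute" error.
        raise JvmAttributeNotSupportedError(
            error_class="JVM_ATTRIBUTE_NOT_SUPPORTED",
            message_parameters={"attr_name": "_jsc"},
        )
```

Because the stand-in error subclasses `AttributeError`, existing code that catches `AttributeError` or calls `hasattr(session, "_jsc")` keeps working in this sketch.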
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   This PR does not change functionality. The only user-facing change is the clearer error message raised when an unsupported JVM attribute is accessed.
   
   ### How was this patch tested?
   
   This patch was tested by adding unit tests that check the error message for unsupported JVM attributes. The tests were run locally in a development environment.
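
The tests themselves are not quoted in this thread. As a rough sketch of what such a check might look like, the example below uses `unittest` against a hypothetical stand-in class, `FakeConnectSession`, rather than the real Spark Connect session, and only asserts that the error names the error class and the attribute.

```python
import unittest


class FakeConnectSession:
    """Hypothetical stand-in for the Spark Connect session (not PySpark code)."""

    @property
    def _jsc(self):
        raise AttributeError(
            "[JVM_ATTRIBUTE_NOT_SUPPORTED] Attribute `_jsc` is not supported "
            "in Spark Connect as it depends on the JVM."
        )


class JvmAttributeErrorTest(unittest.TestCase):
    def test_jsc_reports_error_class(self):
        # Accessing the attribute must raise, and the message must identify
        # both the error class and the offending attribute.
        with self.assertRaises(AttributeError) as ctx:
            FakeConnectSession()._jsc
        self.assertIn("JVM_ATTRIBUTE_NOT_SUPPORTED", str(ctx.exception))
        self.assertIn("`_jsc`", str(ctx.exception))
```

Saved as a module, this can be run with `python -m unittest`.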


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] gatorsmile commented on a diff in pull request #40458: [SPARK-42824][CONNECT][PYTHON] Provide a clear error message for unsupported JVM attributes

Posted by "gatorsmile (via GitHub)" <gi...@apache.org>.
gatorsmile commented on code in PR #40458:
URL: https://github.com/apache/spark/pull/40458#discussion_r1139537952


##########
python/pyspark/errors/error_classes.py:
##########
@@ -39,6 +39,11 @@
       "Function `<func_name>` should return Column, got <return_type>."
     ]
   },
+  "JVM_ATTRIBUTE_NOT_SUPPORTED" : {
+    "message" : [
+      "Attribute `<attr_name>` is not supported in Spark Connect as it depends on the JVM. If you need to use this attribute, do not use Spark Connect when creating your session."

Review Comment:
   LGTM
   





[GitHub] [spark] HyukjinKwon closed pull request #40458: [SPARK-42824][CONNECT][PYTHON] Provide a clear error message for unsupported JVM attributes

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon closed pull request #40458: [SPARK-42824][CONNECT][PYTHON] Provide a clear error message for unsupported JVM attributes
URL: https://github.com/apache/spark/pull/40458




[GitHub] [spark] itholic commented on a diff in pull request #40458: [SPARK-42824][CONNECT][PYTHON] Provide a clear error message for unsupported JVM attributes.

Posted by "itholic (via GitHub)" <gi...@apache.org>.
itholic commented on code in PR #40458:
URL: https://github.com/apache/spark/pull/40458#discussion_r1138872667


##########
python/pyspark/errors/error_classes.py:
##########
@@ -39,6 +39,11 @@
       "Function `<func_name>` should return Column, got <return_type>."
     ]
   },
+  "JVM_ATTRIBUTE_NOT_SUPPORTED" : {
+    "message" : [
+      "Attribute `<attr_name>` is not supported in Spark Connect as it depends on the JVM. If you need to use this attribute, use the original PySpark instead of Spark Connect."

Review Comment:
   I'm not sure "original" is the best term to use here. Please let me know if anyone has a better suggestion.





[GitHub] [spark] itholic commented on pull request #40458: [SPARK-42824][CONNECT][PYTHON] Provide a clear error message for unsupported JVM attributes.

Posted by "itholic (via GitHub)" <gi...@apache.org>.
itholic commented on PR #40458:
URL: https://github.com/apache/spark/pull/40458#issuecomment-1472234887

   > [JVM_ATTRIBUTE_NOT_SUPPORTED] Attribute _jsc is not supported in Spark Connect as it depends on the JVM. If you need to use this attribute, do not use Spark Connect when creating your session.
   
   Sounds good. I just applied the suggested wording to the error message.
   Thanks!




[GitHub] [spark] HyukjinKwon commented on a diff in pull request #40458: [SPARK-42824][CONNECT][PYTHON] Provide a clear error message for unsupported JVM attributes.

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on code in PR #40458:
URL: https://github.com/apache/spark/pull/40458#discussion_r1139522726


##########
python/pyspark/errors/error_classes.py:
##########
@@ -39,6 +39,11 @@
       "Function `<func_name>` should return Column, got <return_type>."
     ]
   },
+  "JVM_ATTRIBUTE_NOT_SUPPORTED" : {
+    "message" : [
+      "Attribute `<attr_name>` is not supported in Spark Connect as it depends on the JVM. If you need to use this attribute, do not use Spark Connect when creating your session."

Review Comment:
   cc @gatorsmile @grundprinzip do you have any suggestions on this error message?





[GitHub] [spark] allanf-db commented on pull request #40458: [SPARK-42824][CONNECT][PYTHON] Provide a clear error message for unsupported JVM attributes.

Posted by "allanf-db (via GitHub)" <gi...@apache.org>.
allanf-db commented on PR #40458:
URL: https://github.com/apache/spark/pull/40458#issuecomment-1472225271

   Instead of suggesting that the user switch to another PySpark version, I think it's better to suggest that the user create a Spark Driver session instead of a Spark Connect session.
   
   So instead of:
   [JVM_ATTRIBUTE_NOT_SUPPORTED] Attribute `_jsc` is not supported in Spark Connect as it depends on the JVM. If you need to use this attribute, use the original PySpark instead of Spark Connect.
   
   Perhaps something more like:
   [JVM_ATTRIBUTE_NOT_SUPPORTED] Attribute `_jsc` is not supported in Spark Connect as it depends on the JVM. If you need to use this attribute, do not use Spark Connect when creating your session.
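
To make "do not use Spark Connect when creating your session" concrete, here is a hedged sketch of the two ways a session can be created. It assumes PySpark 3.4 or later is installed; the function names, the `local[*]` master, and the `sc://localhost:15002` URL are illustrative choices, not part of this PR. The imports sit inside the functions so the module loads even without PySpark.

```python
def create_classic_session():
    """Create a session backed by a local JVM driver, where `_jsc` exists."""
    from pyspark.sql import SparkSession  # requires a PySpark installation

    return SparkSession.builder.master("local[*]").getOrCreate()


def create_connect_session(url="sc://localhost:15002"):
    """Create a Spark Connect session; JVM attributes such as `_jsc` are unsupported."""
    from pyspark.sql import SparkSession  # requires PySpark with Spark Connect

    # The URL is an assumption: it points at a Spark Connect server
    # that would need to be running locally.
    return SparkSession.builder.remote(url).getOrCreate()
```

Code that needs `_jsc` would use the first form; code that only uses the DataFrame API can use either.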




[GitHub] [spark] HyukjinKwon commented on pull request #40458: [SPARK-42824][CONNECT][PYTHON] Provide a clear error message for unsupported JVM attributes

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #40458:
URL: https://github.com/apache/spark/pull/40458#issuecomment-1473012048

   Merged to master and branch-3.4.

