Posted to reviews@spark.apache.org by "xinrong-meng (via GitHub)" <gi...@apache.org> on 2023/03/23 08:30:06 UTC

[GitHub] [spark] xinrong-meng opened a new pull request, #40534: [SPARK-42908][PYTHON] Raise RuntimeError if SparkContext is not initialized when parsing DDL-formatted type strings

xinrong-meng opened a new pull request, #40534:
URL: https://github.com/apache/spark/pull/40534

   ### What changes were proposed in this pull request?
   Raise a `RuntimeError` with a clear message if the SparkContext is not initialized when parsing DDL-formatted type strings.
   
   ### Why are the changes needed?
   Error-message improvement: users get a clear `RuntimeError` instead of a bare `AssertionError` when no SparkContext is active.
   
   
   ### Does this PR introduce _any_ user-facing change?
   Yes. When a DDL-formatted type string is parsed without an initialized SparkContext, a `RuntimeError` with a clear message is raised instead of an `AssertionError`.
   
   
   ### How was this patch tested?
   Unit test.
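   The guard this PR introduces can be sketched without PySpark installed. The names below are illustrative stand-ins, not the actual PySpark code; the error message follows the wording adopted later in this thread:
   ```python
   # Minimal, self-contained sketch of the behavior change. Before the PR,
   # parsing with no active SparkContext tripped a bare assertion; after,
   # a RuntimeError with an explicit message is raised.

   _active_spark_context = None  # simulates: no SparkContext created yet


   def parse_ddl_before(ddl: str):
       sc = _active_spark_context
       assert sc is not None  # fails with an opaque AssertionError
       return sc


   def parse_ddl_after(ddl: str):
       sc = _active_spark_context
       if sc is None:
           # Clear, user-facing error instead of an assertion failure
           raise RuntimeError(
               "SparkContext or SparkSession should be created first."
           )
       return sc
   ```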
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] xinrong-meng commented on a diff in pull request #40534: [SPARK-42908][PYTHON] Raise RuntimeError when SparkContext is required but not initialized

Posted by "xinrong-meng (via GitHub)" <gi...@apache.org>.
xinrong-meng commented on code in PR #40534:
URL: https://github.com/apache/spark/pull/40534#discussion_r1148751042


##########
python/pyspark/sql/types.py:
##########
@@ -1211,7 +1211,10 @@ def _parse_datatype_string(s: str) -> DataType:
     from pyspark import SparkContext
 
     sc = SparkContext._active_spark_context
-    assert sc is not None

Review Comment:
   Sounds good!
   
   After the changes proposed in the PR, `python/pyspark/sql` has no such assertions.
   ```
    $ git grep 'assert sc is not None' -- python/pyspark/sql
    $ 
   ```





[GitHub] [spark] HyukjinKwon closed pull request #40534: [SPARK-42908][PYTHON] Raise RuntimeError when SparkContext is required but not initialized

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon closed pull request #40534: [SPARK-42908][PYTHON] Raise RuntimeError when SparkContext is required but not initialized
URL: https://github.com/apache/spark/pull/40534




[GitHub] [spark] HyukjinKwon commented on a diff in pull request #40534: [SPARK-42908][PYTHON] Raise RuntimeError when SparkContext is required but not initialized

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on code in PR #40534:
URL: https://github.com/apache/spark/pull/40534#discussion_r1149935560


##########
python/pyspark/sql/utils.py:
##########
@@ -193,6 +193,15 @@ def wrapped(*args: Any, **kwargs: Any) -> Any:
     return cast(FuncT, wrapped)
 
 
+def require_spark_context_initialized() -> SparkContext:
+    """Raise RuntimeError if SparkContext is not initialized,
+    otherwise, returns the active SparkContext."""
+    sc = SparkContext._active_spark_context
+    if sc is None or sc._jvm is None:
+        raise RuntimeError("SparkContext must be initialized for JVM access.")

Review Comment:
   I would just say something like: `SparkContext or SparkSession should be created first` instead
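   Applied to the diff above, the helper might read as follows. This is a sketch only: `_FakeSparkContext` is a stand-in so it runs without PySpark (the real check operates on `pyspark.SparkContext._active_spark_context`), and only the message wording changes from the quoted diff:
   ```python
   # Sketch of the proposed helper with the suggested error message applied.

   class _FakeSparkContext:
       """Stand-in for pyspark.SparkContext, for illustration only."""
       _active_spark_context = None  # no context has been created yet
       _jvm = None


   def require_spark_context_initialized(sc_cls=_FakeSparkContext):
       """Return the active SparkContext, or raise a clear RuntimeError."""
       sc = sc_cls._active_spark_context
       if sc is None or sc._jvm is None:
           raise RuntimeError(
               "SparkContext or SparkSession should be created first."
           )
       return sc
   ```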





[GitHub] [spark] HyukjinKwon commented on a diff in pull request #40534: [SPARK-42908][PYTHON] Raise RuntimeError if SparkContext is not initialized when parsing DDL-formatted type strings

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on code in PR #40534:
URL: https://github.com/apache/spark/pull/40534#discussion_r1146116074


##########
python/pyspark/sql/types.py:
##########
@@ -1211,7 +1211,10 @@ def _parse_datatype_string(s: str) -> DataType:
     from pyspark import SparkContext
 
     sc = SparkContext._active_spark_context
-    assert sc is not None

Review Comment:
   In fact, we should fix a lot of these places, e.g., in `pyspark.sql.functions.*`, too.





[GitHub] [spark] HyukjinKwon commented on pull request #40534: [SPARK-42908][PYTHON] Raise RuntimeError when SparkContext is required but not initialized

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #40534:
URL: https://github.com/apache/spark/pull/40534#issuecomment-1486246281

   Merged to master and branch-3.4.




[GitHub] [spark] xinrong-meng commented on a diff in pull request #40534: [SPARK-42908][PYTHON] Raise RuntimeError when SparkContext is required but not initialized

Posted by "xinrong-meng (via GitHub)" <gi...@apache.org>.
xinrong-meng commented on code in PR #40534:
URL: https://github.com/apache/spark/pull/40534#discussion_r1149944567


##########
python/pyspark/sql/utils.py:
##########
@@ -193,6 +193,15 @@ def wrapped(*args: Any, **kwargs: Any) -> Any:
     return cast(FuncT, wrapped)
 
 
+def require_spark_context_initialized() -> SparkContext:

Review Comment:
   Good point, renamed.



##########
python/pyspark/sql/utils.py:
##########
@@ -193,6 +193,15 @@ def wrapped(*args: Any, **kwargs: Any) -> Any:
     return cast(FuncT, wrapped)
 
 
+def require_spark_context_initialized() -> SparkContext:
+    """Raise RuntimeError if SparkContext is not initialized,
+    otherwise, returns the active SparkContext."""
+    sc = SparkContext._active_spark_context
+    if sc is None or sc._jvm is None:
+        raise RuntimeError("SparkContext must be initialized for JVM access.")

Review Comment:
   Updated.





[GitHub] [spark] HyukjinKwon commented on a diff in pull request #40534: [SPARK-42908][PYTHON] Raise RuntimeError when SparkContext is required but not initialized

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on code in PR #40534:
URL: https://github.com/apache/spark/pull/40534#discussion_r1149934958


##########
python/pyspark/sql/utils.py:
##########
@@ -193,6 +193,15 @@ def wrapped(*args: Any, **kwargs: Any) -> Any:
     return cast(FuncT, wrapped)
 
 
+def require_spark_context_initialized() -> SparkContext:

Review Comment:
   I would name it like: `get_jvm_spark_context`



##########
python/pyspark/sql/utils.py:
##########
@@ -193,6 +193,15 @@ def wrapped(*args: Any, **kwargs: Any) -> Any:
     return cast(FuncT, wrapped)
 
 
+def require_spark_context_initialized() -> SparkContext:

Review Comment:
   I would name it like: `get_active_spark_context`


