You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/08/31 06:01:28 UTC

[GitHub] [spark] zhengruifeng opened a new pull request, #42745: [SPARK-45027][PYTHON] Hide internal functions/variables in `pyspark.sql.functions` from auto-completion

zhengruifeng opened a new pull request, #42745:
URL: https://github.com/apache/spark/pull/42745

   
   ### What changes were proposed in this pull request?
   Hide internal functions/variables in `pyspark.sql.functions` from auto-completion
   
   
   ### Why are the changes needed?
   to hide internal functions/variables which can be confusing, e.g. the internal help functions `to_str`, `get_active_spark_context`
   
   before this PR:
   
   <img width="560" alt="image" src="https://github.com/apache/spark/assets/7322292/ab87d0e8-3ba2-4c71-8c06-aeef939778cf">
   
   <img width="915" alt="image" src="https://github.com/apache/spark/assets/7322292/e138804f-8a7a-4526-9b1a-8338438e14e3">
   
   
   after this PR:
   <img width="562" alt="image" src="https://github.com/apache/spark/assets/7322292/e1710729-cf8f-49d4-b276-4632a88ea5ec">
   
   <img width="774" alt="image" src="https://github.com/apache/spark/assets/7322292/50b8e6f7-9dba-46e6-97f5-5cf8b115bffb">
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   yes
   
   ### How was this patch tested?
   manually check
   
   
   ### Was this patch authored or co-authored using generative AI tooling?
   no


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng commented on a diff in pull request #42745: [SPARK-45027][PYTHON] Hide internal functions/variables in `pyspark.sql.functions` from auto-completion

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.
zhengruifeng commented on code in PR #42745:
URL: https://github.com/apache/spark/pull/42745#discussion_r1317994328


##########
python/pyspark/sql/utils.py:
##########
@@ -179,7 +179,7 @@ def is_remote() -> bool:
     return "SPARK_CONNECT_MODE_ENABLED" in os.environ
 
 
-def try_remote_functions(f: FuncT) -> FuncT:

Review Comment:
   ok



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng commented on a diff in pull request #42745: [SPARK-45027][PYTHON] Hide internal functions/variables in `pyspark.sql.functions` from auto-completion

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.
zhengruifeng commented on code in PR #42745:
URL: https://github.com/apache/spark/pull/42745#discussion_r1318118585


##########
python/pyspark/sql/functions.py:
##########
@@ -55,11 +55,13 @@
 
 # Keep pandas_udf and PandasUDFType import for backwards compatible import; moved in SPARK-28264
 from pyspark.sql.pandas.functions import pandas_udf, PandasUDFType  # noqa: F401
+
+# TODO: should directly rename the internal functions

Review Comment:
   got it, let me remove it



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng commented on pull request #42745: [SPARK-45027][PYTHON] Hide internal functions/variables in `pyspark.sql.functions` from auto-completion

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.
zhengruifeng commented on PR #42745:
URL: https://github.com/apache/spark/pull/42745#issuecomment-1712981391

   thanks, merged to master


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a diff in pull request #42745: [SPARK-45027][PYTHON] Hide internal functions/variables in `pyspark.sql.functions` from auto-completion

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on code in PR #42745:
URL: https://github.com/apache/spark/pull/42745#discussion_r1318078679


##########
python/pyspark/sql/functions.py:
##########
@@ -55,11 +55,13 @@
 
 # Keep pandas_udf and PandasUDFType import for backwards compatible import; moved in SPARK-28264
 from pyspark.sql.pandas.functions import pandas_udf, PandasUDFType  # noqa: F401
+
+# TODO: should directly rename the internal functions

Review Comment:
   It's actually fine to don't rename. In practice, we only name underscore when the file is exposed as an API. `pyspark.sql.utils` isn't documented so it's fine. If those methods are used in other (internal) modules, we can just don't add `_` prefix.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng closed pull request #42745: [SPARK-45027][PYTHON] Hide internal functions/variables in `pyspark.sql.functions` from auto-completion

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.
zhengruifeng closed pull request #42745: [SPARK-45027][PYTHON] Hide internal functions/variables in `pyspark.sql.functions` from auto-completion
URL: https://github.com/apache/spark/pull/42745


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a diff in pull request #42745: [SPARK-45027][PYTHON] Hide internal functions/variables in `pyspark.sql.functions` from auto-completion

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on code in PR #42745:
URL: https://github.com/apache/spark/pull/42745#discussion_r1317991376


##########
python/pyspark/sql/utils.py:
##########
@@ -179,7 +179,7 @@ def is_remote() -> bool:
     return "SPARK_CONNECT_MODE_ENABLED" in os.environ
 
 
-def try_remote_functions(f: FuncT) -> FuncT:

Review Comment:
   you could just leave this as is, and fix `functions.py` when you import with an alias. e.g., `from pyspark.sql.utils import try_remote_functions as _try_remote_functions`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng commented on pull request #42745: [SPARK-45027][PYTHON] Hide internal functions/variables in `pyspark.sql.functions` from auto-completion

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.
zhengruifeng commented on PR #42745:
URL: https://github.com/apache/spark/pull/42745#issuecomment-1709402723

   > I think you can remove `__all__` in this case.
   
   yeah, when we make sure all internal fields starts with `_`, the `__all__` is no longer needed, will update


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org