You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/11/27 01:04:53 UTC

[PR] [SPARK-46107][PYTHON][ML] Deprecate pyspark.keyword_only API [spark]

HyukjinKwon opened a new pull request, #44023:
URL: https://github.com/apache/spark/pull/44023

   ### What changes were proposed in this pull request?
   
   This PR deprecates `pyspark.keyword_only` API, remove the usage in our codebase.
   
   Note that this PR also removes the docstring that is used for substituting the signature (`autodoc_docstring_signature` feature) in user-facing API documentation. We use keyword-only arguments after SPARK-32933 so we do not need to substitute them anymore, see also https://github.com/sphinx-doc/sphinx/issues/5142.
   
   ### Why are the changes needed?
   
   Initially `pyspark.keyword_only` was added to maintain the Python compatibility between Python 2 and Python 3 (see [PEP-3102](https://peps.python.org/pep-3102/)). Those Python 2-style arguments were removed in SPARK-32933 so we do not need this decorator anymore.
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes, it deprecates `pyspark.keyword_only`.
   
   ### How was this patch tested?
   
   Manually checked the documentation build, and linter. Existing test cases should validate them.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   No.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46107][PYTHON][ML] Deprecate pyspark.keyword_only API [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon closed pull request #44023: [SPARK-46107][PYTHON][ML] Deprecate pyspark.keyword_only API
URL: https://github.com/apache/spark/pull/44023


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46107][PYTHON][ML] Deprecate pyspark.keyword_only API [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #44023:
URL: https://github.com/apache/spark/pull/44023#issuecomment-1827552401

   Alrigjt, I don't think I can make this working without the decorator because the fact that the argument is given matters (vs default, see also https://github.com/apache/spark/commit/cb43bbe13606673349511829fd71d1f34fc39c45). I am dropping this PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46107][PYTHON][ML] Deprecate pyspark.keyword_only API [spark]

Posted by "WeichenXu123 (via GitHub)" <gi...@apache.org>.
WeichenXu123 commented on code in PR #44023:
URL: https://github.com/apache/spark/pull/44023#discussion_r1405554054


##########
python/pyspark/ml/classification.py:
##########
@@ -1275,7 +1263,6 @@ def __init__(
     ):
         ...
 
-    @keyword_only

Review Comment:
   I don't think so ? It saves params to `self._input_kwargs` and we use `self._input_kwargs` in following code



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46107][PYTHON][ML] Deprecate pyspark.keyword_only API [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on code in PR #44023:
URL: https://github.com/apache/spark/pull/44023#discussion_r1405778545


##########
python/pyspark/ml/clustering.py:
##########
@@ -448,14 +441,12 @@ def setParams(
         weightCol: Optional[str] = None,
     ) -> "GaussianMixture":
         """
-        setParams(self, \\*, featuresCol="features", predictionCol="prediction", k=2, \

Review Comment:
   ![Screenshot 2023-11-27 at 5 05 53 PM](https://github.com/apache/spark/assets/6477701/9099f1c5-a1fa-4a71-8595-06f101f28d7f)
   



##########
python/pyspark/ml/clustering.py:
##########
@@ -417,22 +414,18 @@ def __init__(
         aggregationDepth: int = 2,
         weightCol: Optional[str] = None,
     ):
-        """
-        __init__(self, \\*, featuresCol="features", predictionCol="prediction", k=2, \
-                 probabilityCol="probability", tol=0.01, maxIter=100, seed=None, \
-                 aggregationDepth=2, weightCol=None)
-        """

Review Comment:
   Manually checked:
   ![Screenshot 2023-11-27 at 5 05 57 PM](https://github.com/apache/spark/assets/6477701/ed3f4708-1959-4659-b0dd-e76ff3cc67e6)
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46107][PYTHON][ML] Deprecate pyspark.keyword_only API [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on code in PR #44023:
URL: https://github.com/apache/spark/pull/44023#discussion_r1405556038


##########
python/pyspark/ml/classification.py:
##########
@@ -1275,7 +1263,6 @@ def __init__(
     ):
         ...
 
-    @keyword_only

Review Comment:
   We should manually store them, or use a differnet decorator then. Let me fix it up.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46107][PYTHON][ML] Deprecate pyspark.keyword_only API [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #44023:
URL: https://github.com/apache/spark/pull/44023#issuecomment-1827067123

   cc @zero323 too


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46107][PYTHON][ML] Deprecate pyspark.keyword_only API [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #44023:
URL: https://github.com/apache/spark/pull/44023#issuecomment-1826980150

   cc @mengxr @WeichenXu123 FYI


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org