Posted to reviews@spark.apache.org by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/08/25 08:31:51 UTC

[GitHub] [spark] HyukjinKwon opened a new pull request, #42679: [SPARK-44964][ML][CONNECT] Clean up pyspark.ml.connect.functions doctest

HyukjinKwon opened a new pull request, #42679:
URL: https://github.com/apache/spark/pull/42679

   ### What changes were proposed in this pull request?
   
   This PR proposes to clean up the `pyspark.ml.connect.functions` doctest. All of the doctests in that module are currently skipped.
   
   ### Why are the changes needed?
   
   To remove unused test code.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No, test-only.
   
   ### How was this patch tested?
   
   Manually ran the tests via:
   
   ```bash
   ./python/run-tests --python-executables=python3 --modules=pyspark-ml-connect
   ```
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   No
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng commented on pull request #42679: [SPARK-44964][ML][CONNECT][TESTS] Clean up pyspark.ml.connect.functions doctest

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.
zhengruifeng commented on PR #42679:
URL: https://github.com/apache/spark/pull/42679#issuecomment-1693182052

   @HyukjinKwon I think this test is working?
   
   https://github.com/apache/spark/actions/runs/5972917734/job/16204296750
   
   ```
   Starting test(python3.9): pyspark.ml.connect.functions (temp output: /__w/spark/spark/python/target/e19b8ef9-6a72-4ec1-b6e6-5a5674016f99/python3.9__pyspark.ml.connect.functions__1m6vmd3h.log)
   Finished test(python3.9): pyspark.ml.connect.functions (10s)
   ```




[GitHub] [spark] HyukjinKwon commented on a diff in pull request #42679: [SPARK-44964][ML][CONNECT] Clean up pyspark.ml.connect.functions doctest

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on code in PR #42679:
URL: https://github.com/apache/spark/pull/42679#discussion_r1305366059


##########
python/pyspark/ml/connect/__init__.py:
##########
@@ -16,6 +16,9 @@
 #
 
 """Spark Connect Python Client - ML module"""
+from pyspark.sql.connect.utils import check_dependencies

Review Comment:
   We also check the required dependencies at the top level of the module when it is imported.
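
As a rough illustration of an import-time dependency check, here is a minimal sketch. It is not the implementation of `check_dependencies` in `pyspark.sql.connect.utils`; the helper name `require_modules` and the error message are hypothetical, and only the standard-library call `importlib.util.find_spec` is used.

```python
import importlib.util


def require_modules(*names):
    """Raise ImportError listing every top-level module in `names` that is not installed.

    Hypothetical helper for illustration only; pass top-level module names.
    """
    # find_spec returns None when a top-level module cannot be located.
    missing = [name for name in names if importlib.util.find_spec(name) is None]
    if missing:
        raise ImportError("Missing required dependencies: " + ", ".join(missing))


# Calling the check at the top level of a package's __init__.py means it runs
# once, when the package is first imported, before any submodule is used.
require_modules("json")
```

With a check like this in the package's `__init__.py`, a missing dependency surfaces as a single clear error at import time rather than later inside an individual module or test.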





[GitHub] [spark] HyukjinKwon commented on a diff in pull request #42679: [SPARK-44964][ML][CONNECT][TESTS] Clean up pyspark.ml.connect.functions doctest

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on code in PR #42679:
URL: https://github.com/apache/spark/pull/42679#discussion_r1306317406


##########
python/pyspark/ml/connect/functions.py:
##########
@@ -36,41 +31,3 @@ def array_to_vector(col: Column) -> Column:
 
 
 array_to_vector.__doc__ = PyMLFunctions.array_to_vector.__doc__
-
-
-def _test() -> None:
-    import sys
-    import doctest
-    from pyspark.sql import SparkSession as PySparkSession
-    import pyspark.ml.connect.functions
-
-    globs = pyspark.ml.connect.functions.__dict__.copy()
-
-    # TODO: split vector_to_array doctest since it includes .mllib vectors
-    del pyspark.ml.connect.functions.vector_to_array.__doc__
-
-    # TODO: spark.createDataFrame should support UDT
-    del pyspark.ml.connect.functions.array_to_vector.__doc__

Review Comment:
   That works, but none of the tests are actually running :-)
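
The effect described here can be reproduced with the standard `doctest` module alone. The sketch below is an illustration under stated assumptions, not Spark's test harness: `make_double`, `count_doctests`, and `demo` are hypothetical names. It shows that once a function's `__doc__` is deleted, the doctest finder has nothing to collect, so a run finishes without failures while executing zero examples.

```python
import doctest


def make_double():
    """Build a fresh function whose docstring carries one doctest example."""

    def double(x):
        """Return twice the input.

        >>> double(2)
        4
        """
        return 2 * x

    return double


def count_doctests(func):
    """Run the doctest examples found on `func` and return (attempted, failed)."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # module=False skips module lookup; globs makes the function visible to its examples.
    for test in finder.find(func, name=func.__name__, module=False,
                            globs={func.__name__: func}):
        runner.run(test)
    results = runner.summarize(verbose=False)
    return results.attempted, results.failed


def demo():
    """Count doctests before and after deleting the docstring."""
    func = make_double()
    before = count_doctests(func)
    del func.__doc__  # same pattern as the removed _test() helper in the diff
    after = count_doctests(func)
    return before, after


if __name__ == "__main__":
    print(demo())
```

A run that reports no failures is therefore not evidence that any example was executed; checking the attempted count distinguishes the two cases.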





[GitHub] [spark] HyukjinKwon commented on pull request #42679: [SPARK-44964][ML][CONNECT][TESTS] Clean up pyspark.ml.connect.functions doctest

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #42679:
URL: https://github.com/apache/spark/pull/42679#issuecomment-1694576767

   Thanks!




[GitHub] [spark] dongjoon-hyun closed pull request #42679: [SPARK-44964][ML][CONNECT][TESTS] Clean up pyspark.ml.connect.functions doctest

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun closed pull request #42679: [SPARK-44964][ML][CONNECT][TESTS] Clean up pyspark.ml.connect.functions doctest
URL: https://github.com/apache/spark/pull/42679

