Posted to reviews@spark.apache.org by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/10/04 06:17:08 UTC

[PR] [SPARK-45396][PYTHON][DOCS] Add doc entry for `pyspark.ml.connect` module [spark]

HyukjinKwon opened a new pull request, #43210:
URL: https://github.com/apache/spark/pull/43210

   ### What changes were proposed in this pull request?
   
   This PR documents MLlib's Spark Connect support in the API reference.
   
   ### Why are the changes needed?
   
   Without this, users cannot see the `pyspark.ml.connect` Python APIs on the documentation website.
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes, it adds a new page to the user-facing documentation ([PySpark API reference](https://spark.apache.org/docs/latest/api/python/reference/index.html)).
   
   
   ### How was this patch tested?
   
   Manually tested via:
   
   ```bash
   cd python/docs
   make clean html
   ```
   
   ### Was this patch authored or co-authored using generative AI tooling?
   No.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45396][PYTHON][DOCS] Add doc entry for `pyspark.ml.connect` module [spark]

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on code in PR #43210:
URL: https://github.com/apache/spark/pull/43210#discussion_r1345289072


##########
python/pyspark/ml/connect/__init__.py:
##########
@@ -31,13 +31,14 @@
     evaluation,
     tuning,
 )
+from pyspark.ml.connect.evaluation import Evaluator
 
 from pyspark.ml.connect.pipeline import Pipeline, PipelineModel
 
 __all__ = [
     "Estimator",
     "Transformer",
-    "Estimator",
+    "Evaluator",

Review Comment:
   Shall we revisit the PR title, because this touches the main `ml/connect` code to change the exported symbols?



Re: [PR] [SPARK-45396][PYTHON] Add doc entry for `pyspark.ml.connect` module, and adds `Evaluator` to `__all__` at `ml.connect` [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon closed pull request #43210: [SPARK-45396][PYTHON] Add doc entry for `pyspark.ml.connect` module, and adds `Evaluator` to `__all__` at `ml.connect` 
URL: https://github.com/apache/spark/pull/43210


Re: [PR] [SPARK-45396][PYTHON][DOCS] Add doc entry for `pyspark.ml.connect` module [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on code in PR #43210:
URL: https://github.com/apache/spark/pull/43210#discussion_r1345289893


##########
python/pyspark/ml/connect/__init__.py:
##########
@@ -31,13 +31,14 @@
     evaluation,
     tuning,
 )
+from pyspark.ml.connect.evaluation import Evaluator
 
 from pyspark.ml.connect.pipeline import Pipeline, PipelineModel
 
 __all__ = [
     "Estimator",
     "Transformer",
-    "Estimator",
+    "Evaluator",

Review Comment:
   Sure.



Re: [PR] [SPARK-45396][PYTHON] Add doc entry for `pyspark.ml.connect` module, and adds `Evaluator` to `__all__` at `ml.connect` [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #43210:
URL: https://github.com/apache/spark/pull/43210#issuecomment-1945142540

   Reverted at https://github.com/apache/spark/commit/ea6b25767fb86732c108c759fd5393caee22f129 in branch-3.5


Re: [PR] [SPARK-45396][PYTHON][DOCS] Add doc entry for `pyspark.ml.connect` module [spark]

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on code in PR #43210:
URL: https://github.com/apache/spark/pull/43210#discussion_r1345287330


##########
python/pyspark/ml/connect/__init__.py:
##########
@@ -31,13 +31,14 @@
     evaluation,
     tuning,
 )
+from pyspark.ml.connect.evaluation import Evaluator
 
 from pyspark.ml.connect.pipeline import Pipeline, PipelineModel
 
 __all__ = [
     "Estimator",
     "Transformer",
-    "Estimator",
+    "Evaluator",

Review Comment:
   Oh, is this a bug fix?



Re: [PR] [SPARK-45396][PYTHON] Add doc entry for `pyspark.ml.connect` module, and adds `Evaluator` to `__all__` at `ml.connect` [spark]

Posted by "HeartSaVioR (via GitHub)" <gi...@apache.org>.
HeartSaVioR commented on PR #43210:
URL: https://github.com/apache/spark/pull/43210#issuecomment-1942967890

   The error message I saw was the following:
   
   ```
   [autosummary] failed to import 'pyspark.ml.connect.classification.LogisticRegression': no module named pyspark.ml.connect.classification.LogisticRegression
   ```
   
   But adding modules to `__all__` in `__init__.py` did not seem to work as I expected. Maybe I'm missing something.
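
A minimal sketch of how a fully qualified name such as the one in the error above can be resolved in plain Python: import the longest prefix that is a module, then walk the remaining parts with `getattr`. This is not Sphinx's actual code; `resolve_dotted_name` is a hypothetical helper written for illustration. In plain Python, `__all__` only affects `from module import *`, so it plays no part in this kind of lookup. Note also that, in this sketch, any failure while importing a parent module (for example a missing dependency) ends in the same "no module named" error.

```python
import importlib


def resolve_dotted_name(dotted: str):
    """Resolve 'pkg.mod.Name' by importing a module prefix, then using getattr.

    Hypothetical helper for illustration only; it is not Sphinx's implementation.
    """
    parts = dotted.split(".")
    # Try the longest candidate module path first, then shorter prefixes.
    for cut in range(len(parts), 0, -1):
        try:
            obj = importlib.import_module(".".join(parts[:cut]))
            # The remaining parts must be attributes (classes, functions, ...).
            for attr in parts[cut:]:
                obj = getattr(obj, attr)
        except (ImportError, AttributeError):
            continue
        return obj
    # Every prefix failed; the real cause is hidden behind this generic message.
    raise ImportError(f"no module named {dotted}")


# Standard-library example: 'collections' is the module, 'OrderedDict' the attribute.
print(resolve_dotted_name("collections.OrderedDict").__name__)
```

Calling the helper with a name whose parent module cannot be imported raises `ImportError` with the full dotted name in the message, which is one way an underlying import failure can end up hidden.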


Re: [PR] [SPARK-45396][PYTHON] Add doc entry for `pyspark.ml.connect` module, and adds `Evaluator` to `__all__` at `ml.connect` [spark]

Posted by "WeichenXu123 (via GitHub)" <gi...@apache.org>.
WeichenXu123 commented on PR #43210:
URL: https://github.com/apache/spark/pull/43210#issuecomment-1748649649

   @HyukjinKwon Could you backport this to the Spark 3.5 branch?


Re: [PR] [SPARK-45396][PYTHON] Add doc entry for `pyspark.ml.connect` module, and adds `Evaluator` to `__all__` at `ml.connect` [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #43210:
URL: https://github.com/apache/spark/pull/43210#issuecomment-1747793178

   Merged to master.


Re: [PR] [SPARK-45396][PYTHON] Add doc entry for `pyspark.ml.connect` module, and adds `Evaluator` to `__all__` at `ml.connect` [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #43210:
URL: https://github.com/apache/spark/pull/43210#issuecomment-1749892183

   Merged to branch-3.5 too.


Re: [PR] [SPARK-45396][PYTHON][DOCS] Add doc entry for `pyspark.ml.connect` module [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on code in PR #43210:
URL: https://github.com/apache/spark/pull/43210#discussion_r1345288697


##########
python/pyspark/ml/connect/__init__.py:
##########
@@ -31,13 +31,14 @@
     evaluation,
     tuning,
 )
+from pyspark.ml.connect.evaluation import Evaluator
 
 from pyspark.ml.connect.pipeline import Pipeline, PipelineModel
 
 __all__ = [
     "Estimator",
     "Transformer",
-    "Estimator",
+    "Evaluator",

Review Comment:
   Yes .. ish .. previously `from pyspark.ml.connect import Evaluator` did not work. Now it works ..
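
A minimal sketch of the behavior described in this comment, using throwaway packages rather than PySpark itself. The package names `toy_before` and `toy_after` and both helper functions are hypothetical; they only mimic the layout in the diff, where the package imports the `evaluation` submodule but, before the change, does not bind `Evaluator` at package level.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical package contents that mimic the diff above (not PySpark code).
BEFORE_INIT = 'from . import evaluation\n__all__ = ["evaluation"]\n'
AFTER_INIT = (
    "from . import evaluation\n"
    "from .evaluation import Evaluator\n"
    '__all__ = ["evaluation", "Evaluator"]\n'
)


def make_package(root: Path, name: str, init_source: str) -> None:
    """Create a tiny package whose `evaluation` submodule defines Evaluator."""
    pkg = root / name
    pkg.mkdir()
    (pkg / "__init__.py").write_text(init_source)
    (pkg / "evaluation.py").write_text("class Evaluator:\n    pass\n")


def from_import_works(package: str, symbol: str) -> bool:
    """Return True if `from <package> import <symbol>` succeeds."""
    try:
        exec(f"from {package} import {symbol}", {})
    except ImportError:
        return False
    return True


root = Path(tempfile.mkdtemp())
make_package(root, "toy_before", BEFORE_INIT)
make_package(root, "toy_after", AFTER_INIT)
sys.path.insert(0, str(root))
importlib.invalidate_caches()

print(from_import_works("toy_before", "Evaluator"))  # False: name not bound in __init__.py
print(from_import_works("toy_after", "Evaluator"))   # True: explicit import binds the name
```

In this sketch it is the added `from .evaluation import Evaluator` line that makes the package-level import work; the `__all__` entry only changes what `from toy_after import *` exports.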



Re: [PR] [SPARK-45396][PYTHON] Add doc entry for `pyspark.ml.connect` module, and adds `Evaluator` to `__all__` at `ml.connect` [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #43210:
URL: https://github.com/apache/spark/pull/43210#issuecomment-1945132337

   Let's revert it in branch-3.5, and fix it again. It's not a critical bug.


Re: [PR] [SPARK-45396][PYTHON] Add doc entry for `pyspark.ml.connect` module, and adds `Evaluator` to `__all__` at `ml.connect` [spark]

Posted by "HeartSaVioR (via GitHub)" <gi...@apache.org>.
HeartSaVioR commented on PR #43210:
URL: https://github.com/apache/spark/pull/43210#issuecomment-1942961137

   It seems like the PySpark docs build is failing due to this while running the release script against branch-3.5. I can see the docs build pass after reverting this commit. 
   It's really odd, as it has been passing in GitHub Actions; I checked with commits in branch-3.5. I suspect the difference may come from different Python/apt library versions (the Docker container used for release), but I have no clear idea about this.
   
   @HyukjinKwon @WeichenXu123 What would be the better way to move forward? Shall I revert this for branch-3.5, or could someone help look at this and fix it forward?
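
One way to check the version-difference suspicion above is to record installed package versions in each environment and compare them. The sketch below uses only the standard library; the package list is an assumption and should be replaced with whatever the docs build actually depends on, and both helper functions are hypothetical.

```python
import platform
from importlib import metadata

# Assumed list of packages relevant to the docs build; adjust as needed.
DOC_PACKAGES = ["sphinx", "numpydoc", "pydata-sphinx-theme"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def diff_versions(left, right):
    """Return {name: (left_version, right_version)} for entries that differ."""
    names = sorted(set(left) | set(right))
    return {
        name: (left.get(name), right.get(name))
        for name in names
        if left.get(name) != right.get(name)
    }


# Run this in each environment and compare the printed dictionaries,
# or feed two saved dictionaries to diff_versions().
print(installed_versions(DOC_PACKAGES))
```

Running it once in the CI environment and once in the release container would give two dictionaries that `diff_versions` can compare directly.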

