Posted to reviews@spark.apache.org by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/09/20 09:07:03 UTC

[GitHub] [spark] HyukjinKwon opened a new pull request, #43013: [MINOR][DOCS][CONNECT] Update notes about supported modules in PySpark API reference

HyukjinKwon opened a new pull request, #43013:
URL: https://github.com/apache/spark/pull/43013

   ### What changes were proposed in this pull request?
   
   This PR proposes to add a couple of notes about which modules are supported by Spark Connect.
   
   ### Why are the changes needed?
   
   So that users know explicitly which modules are supported in PySpark with Spark Connect.
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes, this exposes some notes for PySpark API with Spark Connect.
   
   ### How was this patch tested?
   
   Manually built the site, and checked.
   
   ![Screenshot 2023-09-20 at 6 05 54 PM](https://github.com/apache/spark/assets/6477701/64355af1-f8b2-46bf-8b2a-3ea519995272)
   
   ![Screenshot 2023-09-20 at 6 05 28 PM](https://github.com/apache/spark/assets/6477701/c6f2af70-ec6a-4644-a848-30eedd4f56cf)
   
   ![Screenshot 2023-09-20 at 6 05 32 PM](https://github.com/apache/spark/assets/6477701/b701f5c0-477d-4f7c-9c02-96c307bf14e2)
   
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   No.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on pull request #43013: [MINOR][DOCS][CONNECT] Update notes about supported modules in PySpark API reference

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #43013:
URL: https://github.com/apache/spark/pull/43013#issuecomment-1728597889

   Yeah. We should probably document `pyspark.ml.connect` too, but there are no examples out yet. Let's document that separately.




[GitHub] [spark] HyukjinKwon commented on pull request #43013: [MINOR][DOCS][CONNECT] Update notes about supported modules in PySpark API reference

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #43013:
URL: https://github.com/apache/spark/pull/43013#issuecomment-1728598145

   Merged to master.




[GitHub] [spark] HyukjinKwon commented on pull request #43013: [MINOR][DOCS][CONNECT] Update notes about supported modules in PySpark API reference

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #43013:
URL: https://github.com/apache/spark/pull/43013#issuecomment-1727296619

   Build: https://github.com/HyukjinKwon/spark/actions/runs/6245820107/job/16955248336




[GitHub] [spark] zhengruifeng commented on pull request #43013: [MINOR][DOCS][CONNECT] Update notes about supported modules in PySpark API reference

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.
zhengruifeng commented on PR #43013:
URL: https://github.com/apache/spark/pull/43013#issuecomment-1727396313

   `/mllib` in Scala, and `pyspark.ml` and `pyspark.mllib` in Python, don't work on Connect.
   
   Only the new module `pyspark.ml.connect` works on Connect.
   
   `pyspark.ml` contains many classification, clustering, etc. algorithms that are not supported in `pyspark.ml.connect`.
   
   cc @WeichenXu123 
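
The module split described in the comment above can be sketched as a small lookup. This is only an illustration: `ml_module_works_on_connect` and the two constants are hypothetical names, not part of PySpark, and the rules encode nothing beyond what this thread states at the time of the PR.

```python
# Hypothetical helper, NOT a PySpark API. It only encodes the statements
# made in this thread: `pyspark.ml.connect` works on Spark Connect, while
# the rest of `pyspark.ml` and all of `pyspark.mllib` do not.
CONNECT_ML_PREFIX = "pyspark.ml.connect"
NON_CONNECT_ML_PREFIXES = ("pyspark.ml", "pyspark.mllib")


def _is_under(module, prefix):
    """True if `module` is `prefix` itself or one of its submodules."""
    return module == prefix or module.startswith(prefix + ".")


def ml_module_works_on_connect(module):
    """Return True or False for ML modules covered by the thread, else None."""
    # Check the Connect-specific package first, since it lives under pyspark.ml.
    if _is_under(module, CONNECT_ML_PREFIX):
        return True
    if any(_is_under(module, p) for p in NON_CONNECT_ML_PREFIXES):
        return False
    # The thread says nothing about other modules, so do not guess.
    return None
```

Checking `pyspark.ml.connect` before `pyspark.ml` matters because the former is a subpackage of the latter; modules the thread does not mention return `None` rather than a guess.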




[GitHub] [spark] HyukjinKwon closed pull request #43013: [MINOR][DOCS][CONNECT] Update notes about supported modules in PySpark API reference

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon closed pull request #43013: [MINOR][DOCS][CONNECT] Update notes about supported modules in PySpark API reference
URL: https://github.com/apache/spark/pull/43013




[GitHub] [spark] zhengruifeng commented on pull request #43013: [MINOR][DOCS][CONNECT] Update notes about supported modules in PySpark API reference

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.
zhengruifeng commented on PR #43013:
URL: https://github.com/apache/spark/pull/43013#issuecomment-1727400614

   `pyspark.ml.connect` only supports a small subset of `pyspark.ml`.

