Posted to reviews@spark.apache.org by "ueshin (via GitHub)" <gi...@apache.org> on 2023/03/14 07:42:04 UTC

[GitHub] [spark] ueshin commented on pull request #40388: [SPARK-42765][CONNECT][PYTHON] Enable importing `pandas_udf` from `pyspark.sql.connect.functions`

ueshin commented on PR #40388:
URL: https://github.com/apache/spark/pull/40388#issuecomment-1467557107

   > The unit tests for pandas_udf are in `python/pyspark/sql/tests/connect/test_parity_pandas_udf.py` if that's your concern.
   
   It's not my concern.
   `sql.functions.pandas_udf` and `sql.connect.functions.pandas_udf` should be usable separately if `PYSPARK_NO_NAMESPACE_SHARE` is set.
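
To illustrate the separation described here, the following is a minimal sketch that uses stand-in functions rather than PySpark itself. It assumes the shared-namespace dispatch is gated on the presence of `PYSPARK_NO_NAMESPACE_SHARE` in the environment. The names `classic_pandas_udf`, `connect_pandas_udf`, `is_remote`, `shares_namespace`, and the `DEMO_REMOTE` variable are hypothetical and exist only for this example.

```python
import functools
import os


def classic_pandas_udf(name):
    # Hypothetical stand-in for sql.functions.pandas_udf.
    return f"classic:{name}"


def connect_pandas_udf(name):
    # Hypothetical stand-in for sql.connect.functions.pandas_udf.
    return f"connect:{name}"


def is_remote():
    # Hypothetical check: DEMO_REMOTE marks a Spark Connect session here.
    return "DEMO_REMOTE" in os.environ


def shares_namespace(connect_impl):
    # Redirect the classic entry point to the Connect one only when
    # running remotely AND namespace sharing has not been disabled.
    def decorator(classic_impl):
        @functools.wraps(classic_impl)
        def wrapper(*args, **kwargs):
            if is_remote() and "PYSPARK_NO_NAMESPACE_SHARE" not in os.environ:
                return connect_impl(*args, **kwargs)
            return classic_impl(*args, **kwargs)
        return wrapper
    return decorator


# The classic name, wrapped so that it may dispatch to the Connect version.
pandas_udf = shares_namespace(connect_pandas_udf)(classic_pandas_udf)
```

Under these assumptions, setting `PYSPARK_NO_NAMESPACE_SHARE` makes `pandas_udf` stay on the classic implementation even in a remote session, while `connect_pandas_udf` remains directly callable on its own.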
   
   > Considering `PYSPARK_NO_NAMESPACE_SHARE` is not a heavily-used user-facing env variable, shall we merge the change proposed in this PR first for clarity? Otherwise, I am afraid that users will think pandas_udf is not supported in Connect. I will work on a follow-up to support `sql.connect.functions.pandas_udf` with `PYSPARK_NO_NAMESPACE_SHARE` support.
   
   I'd leave the decision to @HyukjinKwon or @zhengruifeng then.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

