Posted to reviews@spark.apache.org by "FurcyPin (via GitHub)" <gi...@apache.org> on 2023/03/05 16:28:07 UTC

[GitHub] [spark] FurcyPin commented on a diff in pull request #40271: [WIP][SPARK-42258][PYTHON] pyspark.sql.functions should not expose typing.cast

FurcyPin commented on code in PR #40271:
URL: https://github.com/apache/spark/pull/40271#discussion_r1125695676


##########
python/pyspark/sql/functions.py:
##########
@@ -22,20 +22,10 @@
 import sys
 import functools
 import warnings
-from typing import (
-    Any,
-    cast,

Review Comment:
   I agree that the change seemed quite cumbersome. It made me wish Python had some kind of "private import" keyword to handle such cases more easily.
   
   I agree with you that `cast` is the only name that might be confusing (perhaps `overload` too, but all the other imported names start with an uppercase letter). The `functions` module feels a little special to me because it is the module I use the most as a Spark user; it is definitely a public API. The 201 other modules don't require such a change.
   
   It's out of the scope of this PR, but perhaps for the long term you could consider reorganizing this module [the same way as I did in one of my projects](https://github.com/FurcyPin/spark-frame/blob/main/spark_frame/transformations.py), which had two advantages:
   - the code of each function was isolated in a separate file (which prevents ending up with a 10,000-line file)
   - there was no import pollution
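   For illustration, here is a hedged sketch (not the actual pyspark code) of two conventions Python already offers to approximate a "private import": aliasing the import with a leading underscore, and declaring `__all__` so only intended names are exported. The `col` function below is a toy stand-in, not the real pyspark implementation.
   
   ```python
   # Convention 1: alias the import with a leading underscore so type
   # checkers and readers treat it as module-private.
   from typing import cast as _cast
   
   # Convention 2: declare __all__ so wildcard imports and documentation
   # tools only expose the intended public names.
   __all__ = ["col"]
   
   def col(name: str) -> str:
       # Toy stand-in for a public function that uses cast internally.
       return _cast(str, name)
   ```
   
   Neither convention stops `from functions import _cast`, but both keep `cast` out of the namespace a user discovers via `dir()` or `from functions import *`.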
   
   For now, I'll do as you suggest: only handle `typing.cast` as a special case and add a unit test to make sure it does not get imported again.
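   A regression test of the kind described could look like the following sketch. The real test would import `pyspark.sql.functions`; here a toy module built with `types.ModuleType` stands in for it, and `col` is a hypothetical public function.
   
   ```python
   import types
   
   # Stand-in for pyspark.sql.functions: a module whose namespace
   # correctly keeps typing.cast out of its public names.
   functions = types.ModuleType("functions")
   functions.col = lambda name: name  # hypothetical public function
   
   def public_names(module):
       """Names a user would discover via dir() or a wildcard import."""
       return [n for n in dir(module) if not n.startswith("_")]
   
   def test_cast_not_exposed(module):
       # The guard: typing.cast must not leak into the module's namespace.
       assert "cast" not in public_names(module)
   
   test_cast_not_exposed(functions)
   ```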



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org
