Posted to issues@spark.apache.org by "Apache Spark (Jira)" <ji...@apache.org> on 2023/03/03 17:37:00 UTC
[jira] [Commented] (SPARK-42258) pyspark.sql.functions should not expose typing.cast
[ https://issues.apache.org/jira/browse/SPARK-42258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17696237#comment-17696237 ]
Apache Spark commented on SPARK-42258:
--------------------------------------
User 'FurcyPin' has created a pull request for this issue:
https://github.com/apache/spark/pull/40271
> pyspark.sql.functions should not expose typing.cast
> ---------------------------------------------------
>
> Key: SPARK-42258
> URL: https://issues.apache.org/jira/browse/SPARK-42258
> Project: Spark
> Issue Type: Improvement
> Components: PySpark
> Affects Versions: 3.3.1
> Reporter: Furcy Pin
> Priority: Minor
>
> In PySpark, the `pyspark.sql.functions` module imports and exposes the function `typing.cast`.
> This can lead to user errors that are hard to spot.
> *Example*
> It took me a few minutes to understand why the following code:
>
> {code:java}
> from pyspark.sql import SparkSession
> from pyspark.sql import functions as f
> spark = SparkSession.builder.getOrCreate()
> df = spark.sql("""SELECT 1 as a""")
> df.withColumn("a", f.cast("STRING", f.col("a"))).printSchema() {code}
> which executes without any problem, gives the following result:
>
>
> {code:java}
> root
> |-- a: integer (nullable = false){code}
> This is because `f.cast` here calls `typing.cast`, which returns its second argument unchanged. The correct syntax is:
> {code:java}
> df.withColumn("a", f.col("a").cast("STRING")).printSchema(){code}
>
> which indeed gives:
> {code:java}
> root
> |-- a: string (nullable = false) {code}
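The silent no-op described above can be reproduced without Spark. `typing.cast(typ, val)` performs no runtime check and returns `val` as-is, so in `f.cast("STRING", f.col("a"))` the string is treated as the "type" and the column comes back untouched. A minimal sketch, assuming a hypothetical `FakeColumn` class as a stand-in for `pyspark.sql.Column` (it is not the real class):

```python
from typing import cast


class FakeColumn:
    """Hypothetical stand-in for pyspark.sql.Column (not the real class)."""

    def __init__(self, name: str, dtype: str) -> None:
        self.name = name
        self.dtype = dtype

    def cast(self, dtype: str) -> "FakeColumn":
        # Like Column.cast, return a new object carrying the target type.
        return FakeColumn(self.name, dtype)


col_a = FakeColumn("a", "integer")

# typing.cast ignores its first argument at runtime and returns the second as-is.
unchanged = cast("STRING", col_a)

# The method on the column object is what actually changes the type.
converted = col_a.cast("STRING")
```

Here `unchanged` is the very same object as `col_a`, still typed `integer`, while `converted` carries the new type. No exception is raised in either case, which is why the mistake is easy to miss.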
> *Suggestion of solution*
> Option 1: The names imported into the module `pyspark.sql.functions` could be given private aliases to prevent this. For instance:
> {code:java}
> from typing import cast as _cast{code}
> Option 2: Import only `typing` and replace all occurrences of `cast` with `typing.cast`.
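To illustrate the effect of Option 1, the sketch below builds two tiny in-memory modules and compares the public names each one exposes. The module names `demo_public` and `demo_private` and both helper functions are hypothetical and exist only for this illustration; neither module body is the actual PySpark source.

```python
import types

# Hypothetical module bodies; neither is the actual pyspark source.
PUBLIC_IMPORT = "from typing import cast\n"
PRIVATE_ALIAS = "from typing import cast as _cast\n"


def build_module(name: str, source: str) -> types.ModuleType:
    """Create an in-memory module by executing `source` in its namespace."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    return module


def public_names(module: types.ModuleType) -> set:
    """Return the module attributes that do not start with an underscore."""
    return {name for name in vars(module) if not name.startswith("_")}


demo_public = build_module("demo_public", PUBLIC_IMPORT)
demo_private = build_module("demo_private", PRIVATE_ALIAS)

exposed = public_names(demo_public)   # contains "cast"
hidden = public_names(demo_private)   # empty: only "_cast" was bound
```

With the private alias, accessing `demo_private.cast` raises `AttributeError` instead of silently resolving to `typing.cast`, so a mistaken `f.cast(...)` call would fail loudly rather than return its input unchanged.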
--
This message was sent by Atlassian Jira
(v8.20.10#820010)