Posted to issues@spark.apache.org by "Sean R. Owen (Jira)" <ji...@apache.org> on 2022/08/31 17:54:00 UTC

[jira] [Commented] (SPARK-39895) pyspark drop doesn't accept *cols

    [ https://issues.apache.org/jira/browse/SPARK-39895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17598545#comment-17598545 ] 

Sean R. Owen commented on SPARK-39895:
--------------------------------------

Not a big deal, but the example doesn't make sense to me. It passes multiple cols in one string, not multiple strings or cols, right? That doesn't seem like the right example.

> pyspark drop doesn't accept *cols 
> ----------------------------------
>
>                 Key: SPARK-39895
>                 URL: https://issues.apache.org/jira/browse/SPARK-39895
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 3.0.3, 3.3.0, 3.2.2
>            Reporter: Santosh Pingale
>            Assignee: Santosh Pingale
>            Priority: Minor
>             Fix For: 3.4.0
>
>
> PySpark's DataFrame.drop has the following signature:
> {color:#4c9aff}{{def drop(self, *cols: "ColumnOrName") -> "DataFrame":}}{color}
> However, when we pass multiple Column objects to drop, it raises a TypeError:
> {{each col in the param list should be a string}}
> *Minimal reproducible example:*
> {color:#4c9aff}values = [("id_1", 5, 9), ("id_2", 5, 1), ("id_3", 4, 3), ("id_1", 3, 3), ("id_2", 4, 3)]{color}
> {color:#4c9aff}df = spark.createDataFrame(values, "id string, point int, count int"){color}
>  |-- id: string (nullable = true)
>  |-- point: integer (nullable = true)
>  |-- count: integer (nullable = true)
> {color:#4c9aff}{{df.drop(df.point, df.count)}}{color}
> {quote}{color:#505f79}/spark/python/lib/pyspark.zip/pyspark/sql/dataframe.py in drop(self, *cols){color}
> {color:#505f79}2537 for col in cols:{color}
> {color:#505f79}2538 if not isinstance(col, str):{color}
> {color:#505f79}-> 2539 raise TypeError("each col in the param list should be a string"){color}
> {color:#505f79}2540 jdf = self._jdf.drop(self._jseq(cols)){color}
> {color:#505f79}2541{color}
> {color:#505f79}TypeError: each col in the param list should be a string{color}
> {quote}
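
For readers without a Spark session at hand, the check visible in the traceback can be sketched in plain Python. This is only an illustration under stated assumptions: `Column` here is a hypothetical stand-in class (not `pyspark.sql.Column`), and `check_strict` and `check_relaxed` are made-up helper names. `check_relaxed` shows one possible way to accept both strings and columns; it is not claimed to be the change that shipped in 3.4.0.

```python
class Column:
    """Hypothetical stand-in for a Spark Column; it only carries a name."""

    def __init__(self, name: str) -> None:
        self.name = name


def check_strict(*cols):
    """Mirror the check from the traceback: every argument must be a str."""
    for col in cols:
        if not isinstance(col, str):
            raise TypeError("each col in the param list should be a string")
    return list(cols)


def check_relaxed(*cols):
    """Hypothetical relaxed check: accept str or Column, return the names."""
    names = []
    for col in cols:
        if isinstance(col, str):
            names.append(col)
        elif isinstance(col, Column):
            names.append(col.name)
        else:
            raise TypeError("each col should be a string or a Column")
    return names


# Strings pass the strict check; Column-like objects do not.
print(check_strict("point", "count"))
try:
    check_strict(Column("point"), Column("count"))
except TypeError as exc:
    print("TypeError:", exc)

# The relaxed sketch accepts a mix of both.
print(check_relaxed(Column("point"), "count"))
```

Going by the check shown in the traceback, passing the column names as plain strings, for example df.drop("point", "count"), should avoid the error on the affected versions.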



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
