You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Sean R. Owen (Jira)" <ji...@apache.org> on 2022/08/31 17:54:00 UTC
[jira] [Commented] (SPARK-39895) pyspark drop doesn't accept *cols
[ https://issues.apache.org/jira/browse/SPARK-39895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17598545#comment-17598545 ]
Sean R. Owen commented on SPARK-39895:
--------------------------------------
Not a big deal, but the example doesn't make sense to me. It's multiple cols in one string, not multiple strings or cols. Right? that doesn't seem like the right example
> pyspark drop doesn't accept *cols
> ----------------------------------
>
> Key: SPARK-39895
> URL: https://issues.apache.org/jira/browse/SPARK-39895
> Project: Spark
> Issue Type: Bug
> Components: PySpark
> Affects Versions: 3.0.3, 3.3.0, 3.2.2
> Reporter: Santosh Pingale
> Assignee: Santosh Pingale
> Priority: Minor
> Fix For: 3.4.0
>
>
> Pyspark dataframe drop has following signature:
> {color:#4c9aff}{{def drop(self, *cols: "ColumnOrName") -> "DataFrame":}}{color}
> However when we try to pass multiple Column types to drop function it raises TypeError
> {{each col in the param list should be a string}}
> *Minimal reproducible example:*
> {color:#4c9aff}values = [("id_1", 5, 9), ("id_2", 5, 1), ("id_3", 4, 3), ("id_1", 3, 3), ("id_2", 4, 3)]{color}
> {color:#4c9aff}df = spark.createDataFrame(values, "id string, point int, count int"){color}
> |– id: string (nullable = true)|
> |– point: integer (nullable = true)|
> |– count: integer (nullable = true)|
> {color:#4c9aff}{{df.drop(df.point, df.count)}}{color}
> {quote}{color:#505f79}/spark/python/lib/pyspark.zip/pyspark/sql/dataframe.py in drop(self, *cols){color}
> {color:#505f79}2537 for col in cols:{color}
> {color:#505f79}2538 if not isinstance(col, str):{color}
> {color:#505f79}-> 2539 raise TypeError("each col in the param list should be a string"){color}
> {color:#505f79}2540 jdf = self._jdf.drop(self._jseq(cols)){color}
> {color:#505f79}2541{color}
> {color:#505f79}TypeError: each col in the param list should be a string{color}
> {quote}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org