Posted to issues@spark.apache.org by "Apache Spark (Jira)" <ji...@apache.org> on 2021/09/18 17:16:00 UTC
[jira] [Assigned] (SPARK-36785) Fix ps.DataFrame.isin
[ https://issues.apache.org/jira/browse/SPARK-36785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-36785:
------------------------------------
Assignee: Apache Spark
> Fix ps.DataFrame.isin
> ---------------------
>
> Key: SPARK-36785
> URL: https://issues.apache.org/jira/browse/SPARK-36785
> Project: Spark
> Issue Type: Sub-task
> Components: PySpark
> Affects Versions: 3.3.0
> Reporter: dgd_contributor
> Assignee: Apache Spark
> Priority: Major
>
> {code:python}
> >>> import pyspark.pandas as ps
> >>> psdf = ps.DataFrame(
> ...     {
> ...         "a": [None, 2, 3, 4, 5, 6, 7, 8, None],
> ...         "b": [None, 5, None, 3, 2, 1, None, 0, 0],
> ...         "c": [1, 5, 1, 3, 2, 1, 1, 0, 0],
> ...     },
> ... )
> >>>
> >>> psdf
>      a    b  c
> 0  NaN  NaN  1
> 1  2.0  5.0  5
> 2  3.0  NaN  1
> 3  4.0  3.0  3
> 4  5.0  2.0  2
> 5  6.0  1.0  1
> 6  7.0  NaN  1
> 7  8.0  0.0  0
> 8  NaN  0.0  0
> >>> other = [1, 2, None]
> >>> psdf.isin(other)
>       a     b     c
> 0  None  None  True
> 1  True  None  None
> 2  None  None  True
> 3  None  None  None
> 4  None  True  True
> 5  None  True  True
> 6  None  None  True
> 7  None  None  None
> 8  None  None  None
> >>> psdf.isin(other).dtypes
> a    bool
> b    bool
> c    bool
> dtype: object
> >>> psdf.to_pandas().isin(other).dtypes
> a    bool
> b    bool
> c    bool
> dtype: object
> >>> psdf.to_pandas().isin(other)
>        a      b      c
> 0  False  False   True
> 1   True  False  False
> 2  False  False   True
> 3  False  False  False
> 4  False   True   True
> 5  False   True   True
> 6  False  False   True
> 7  False  False  False
> 8  False  False  False
> >>>
> {code}
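The None entries in the ps.DataFrame.isin output above are consistent with SQL-style three-valued logic, where an IN test against a list containing NULL yields NULL rather than False when nothing matches. The following is a minimal pure-Python sketch of that reading, not Spark's implementation and not the fix that was eventually applied. The helper names sql_in and strict_isin are hypothetical and exist only for this illustration; the column data and the `other` list are copied from the report.

```python
from typing import Any, Optional, Sequence


def sql_in(value: Any, candidates: Sequence[Any]) -> Optional[bool]:
    """Model a SQL-style three-valued IN test; None stands for NULL (unknown)."""
    if value is None:
        return None  # NULL IN (...) is unknown
    if any(c is not None and c == value for c in candidates):
        return True  # a definite match wins even if the list contains NULL
    # No match: unknown if the list contains NULL, otherwise definitely False.
    return None if any(c is None for c in candidates) else False


def strict_isin(value: Any, candidates: Sequence[Any]) -> bool:
    """Collapse unknown to False so the result is always a plain bool."""
    return sql_in(value, candidates) is True


# Data copied from the report.
columns = {
    "a": [None, 2, 3, 4, 5, 6, 7, 8, None],
    "b": [None, 5, None, 3, 2, 1, None, 0, 0],
    "c": [1, 5, 1, 3, 2, 1, 1, 0, 0],
}
other = [1, 2, None]

for name, values in columns.items():
    three_valued = [sql_in(v, other) for v in values]
    strict = [strict_isin(v, other) for v in values]
    print(name, three_valued)  # mirrors the None/True pattern reported for psdf.isin(other)
    print(name, strict)        # mirrors the False/True pattern reported for pandas
```

Under this model, collapsing the unknown result to False reproduces the pandas output shown in the report for all three columns.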
--
This message was sent by Atlassian Jira
(v8.3.4#803005)