Posted to issues@spark.apache.org by "Xinrong Meng (Jira)" <ji...@apache.org> on 2022/10/10 21:26:00 UTC

[jira] [Resolved] (SPARK-39199) Implement pandas API missing parameters

     [ https://issues.apache.org/jira/browse/SPARK-39199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xinrong Meng resolved SPARK-39199.
----------------------------------
    Resolution: Resolved

> Implement pandas API missing parameters
> ---------------------------------------
>
>                 Key: SPARK-39199
>                 URL: https://issues.apache.org/jira/browse/SPARK-39199
>             Project: Spark
>          Issue Type: Umbrella
>          Components: Pandas API on Spark, PySpark
>    Affects Versions: 3.3.0, 3.4.0, 3.3.1
>            Reporter: Xinrong Meng
>            Assignee: Xinrong Meng
>            Priority: Major
>
> pandas API on Spark aims to make pandas code work on Spark clusters without any changes, so full API coverage has been one of our major goals. Currently, most pandas functions are implemented, but some of them have incomplete parameter support.
> The following kinds of common missing parameters have been resolved:
>  * How to handle NAs
>  * Filter data types
>  * Control result length
>  * Reindex result
> Other missing parameters remain to be implemented (see the doc below).
> See the design and the current status at [https://docs.google.com/document/d/1H6RXL6oc-v8qLJbwKl6OEqBjRuMZaXcTYmrZb9yNm5I/edit?usp=sharing].
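For readers unfamiliar with the four parameter categories listed in the description, here is a minimal sketch in plain pandas. The mapping from each category to a specific parameter (`skipna`, `numeric_only`, `n`, `ignore_index`) is an assumption made for illustration. The issue text does not name the parameters, and the design doc is the authoritative list of what was implemented in pandas API on Spark.

```python
import pandas as pd

# Small frame with one missing value and one non-numeric column.
df = pd.DataFrame({"a": [3.0, None, 1.0], "b": ["x", "y", "z"]})

# How to handle NAs (assumed example: skipna).
na_kept = df["a"].sum(skipna=False)    # NaN propagates into the result
na_skipped = df["a"].sum(skipna=True)  # NaN is ignored

# Filter data types (assumed example: numeric_only).
numeric_means = df.mean(numeric_only=True)  # only column "a" is aggregated

# Control result length (assumed example: n).
top = df["a"].nlargest(n=1)  # keeps a single row

# Reindex result (assumed example: ignore_index).
sorted_df = df.sort_values("a", ignore_index=True)  # index reset to 0..n-1
```

Under the stated goal of running pandas code unchanged, the same calls would be expected to behave the same way on a pandas-on-Spark DataFrame once the corresponding parameter is supported. That expectation is not verified here.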



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org