Posted to issues@spark.apache.org by "Allison Wang (Jira)" <ji...@apache.org> on 2023/10/10 18:21:00 UTC

[jira] [Updated] (SPARK-44076) SPIP: Python Data Source API

     [ https://issues.apache.org/jira/browse/SPARK-44076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Allison Wang updated SPARK-44076:
---------------------------------
    Affects Version/s: 4.0.0
                           (was: 3.5.0)

> SPIP: Python Data Source API
> ----------------------------
>
>                 Key: SPARK-44076
>                 URL: https://issues.apache.org/jira/browse/SPARK-44076
>             Project: Spark
>          Issue Type: New Feature
>          Components: PySpark
>    Affects Versions: 4.0.0
>            Reporter: Allison Wang
>            Priority: Major
>
> This proposal introduces a simple Python API for data sources. It lets Python developers create data sources without learning Scala or dealing with the complexity of the current data source APIs, which makes Spark more accessible to the wider Python developer community. The approach builds on the recently introduced Python user-defined table functions (SPARK-43797), with extensions to support data sources.
> SPIP: https://docs.google.com/document/d/1oYrCKEKHzznljYfJO4kx5K_Npcgt1Slyfph3NEk7JRU/edit?usp=sharing
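
For illustration only, here is a minimal sketch of the general shape such a Python-defined data source might take: one class that declares a schema and a reader that yields rows as tuples, much as a Python UDTF's eval method yields output rows. The class and method names (CounterDataSource, CounterReader, schema, reader, read) and the "rows" option are hypothetical stand-ins chosen for this example. They are not taken from the SPIP document, and the sketch runs in plain Python without Spark.

```python
from typing import Dict, Iterator, Tuple


class CounterReader:
    """Hypothetical reader: produces the rows for the source."""

    def __init__(self, rows: int):
        self.rows = rows

    def read(self) -> Iterator[Tuple[int, str]]:
        # Yield one tuple per output row, in schema column order.
        for i in range(self.rows):
            yield (i, f"row-{i}")


class CounterDataSource:
    """Hypothetical data source; every name here is an assumption."""

    def __init__(self, options: Dict[str, str]):
        # Options arrive as strings, as they would from a reader's .option() calls.
        self.options = options

    def schema(self) -> str:
        # A DDL-style schema string describing the columns that read() yields.
        return "id INT, label STRING"

    def reader(self) -> CounterReader:
        # "rows" is a made-up option name; default to 3 rows when absent.
        return CounterReader(int(self.options.get("rows", "3")))


# Plain-Python usage, no Spark session involved.
source = CounterDataSource({"rows": "2"})
rows = list(source.reader().read())
```

The point of the sketch is only that the author of a source writes ordinary Python classes and generators. How such a class would be registered with and invoked by Spark is defined by the SPIP, not by this example.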



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
