Posted to issues@spark.apache.org by "Li Jin (JIRA)" <ji...@apache.org> on 2017/10/06 15:26:00 UTC

[jira] [Commented] (SPARK-22216) Improving PySpark/Pandas interoperability

    [ https://issues.apache.org/jira/browse/SPARK-22216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16194739#comment-16194739 ] 

Li Jin commented on SPARK-22216:
--------------------------------

cc [~cloud_fan]  [~rxin]

I took the liberty of creating this master Jira for all Arrow-related efforts. Is this the format we want? If this looks good, I will start linking all related issues to it.

> Improving PySpark/Pandas interoperability
> -----------------------------------------
>
>                 Key: SPARK-22216
>                 URL: https://issues.apache.org/jira/browse/SPARK-22216
>             Project: Spark
>          Issue Type: Umbrella
>          Components: PySpark
>    Affects Versions: 2.2.0
>            Reporter: Li Jin
>
> This is an umbrella ticket tracking the general effort of improving performance and interoperability between PySpark and Pandas. The core idea is to use Apache Arrow as the serialization format to reduce the overhead between PySpark and Pandas.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)