Posted to issues@spark.apache.org by "Bryan Cutler (Jira)" <ji...@apache.org> on 2020/01/26 23:22:00 UTC

[jira] [Resolved] (SPARK-30640) Prevent unnecessary copies of data in Arrow to Pandas conversion with Timestamps

     [ https://issues.apache.org/jira/browse/SPARK-30640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Cutler resolved SPARK-30640.
----------------------------------
    Fix Version/s: 3.0.0
       Resolution: Fixed

Issue resolved by pull request 27358
[https://github.com/apache/spark/pull/27358]

> Prevent unnecessary copies of data in Arrow to Pandas conversion with Timestamps
> --------------------------------------------------------------------------------
>
>                 Key: SPARK-30640
>                 URL: https://issues.apache.org/jira/browse/SPARK-30640
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark, SQL
>    Affects Versions: 2.4.4
>            Reporter: Bryan Cutler
>            Assignee: Bryan Cutler
>            Priority: Major
>             Fix For: 3.0.0
>
>
> During Arrow-to-Pandas conversion, timestamp columns are modified to localize them to the current timezone. When there are no timestamp columns, this step can still sometimes result in unnecessary copies of the data. See [https://www.mail-archive.com/dev@arrow.apache.org/msg17008.html] for discussion.
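
As a rough illustration of the idea in the description (not the code from the pull request), the sketch below localizes only timezone-aware timestamp columns and returns the frame untouched when there are none. The function name `localize_timestamps` and its parameters are hypothetical, and it assumes the timestamp columns arrive as timezone-aware pandas datetimes.

```python
import pandas as pd


def localize_timestamps(pdf: pd.DataFrame, timezone: str) -> pd.DataFrame:
    """Convert tz-aware timestamp columns to naive local time.

    Hypothetical helper for illustration only. Columns that are not
    tz-aware timestamps are never reassigned.
    """
    ts_cols = [
        name
        for name, dtype in pdf.dtypes.items()
        if isinstance(dtype, pd.DatetimeTZDtype)
    ]
    if not ts_cols:
        # No timestamp columns: hand back the same object, nothing is rewritten.
        return pdf
    for name in ts_cols:
        # Shift to the target timezone, then drop the tz info (naive local time).
        pdf[name] = pdf[name].dt.tz_convert(timezone).dt.tz_localize(None)
    return pdf
```

With a frame that has no timestamp columns, the call returns the very same object, so no per-column reassignment takes place. Whether the underlying conversion from Arrow itself copies data is a separate question that this sketch does not address.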



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
