Posted to jira@arrow.apache.org by "Joris Van den Bossche (Jira)" <ji...@apache.org> on 2021/01/27 08:45:00 UTC

[jira] [Assigned] (ARROW-9634) [C++][Python] Restore non-UTC time zones when reading Parquet file that was previously Arrow

     [ https://issues.apache.org/jira/browse/ARROW-9634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joris Van den Bossche reassigned ARROW-9634:
--------------------------------------------

    Assignee: Joris Van den Bossche

> [C++][Python] Restore non-UTC time zones when reading Parquet file that was previously Arrow
> --------------------------------------------------------------------------------------------
>
>                 Key: ARROW-9634
>                 URL: https://issues.apache.org/jira/browse/ARROW-9634
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++, Python
>            Reporter: Wes McKinney
>            Assignee: Joris Van den Bossche
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> This was reported on the mailing list:
> {code}
> # Assumes earlier in the session: import numpy as np, pandas as pd, pyarrow as pa, pyarrow.parquet as pq
> In [20]: df = pd.DataFrame({'a': pd.Series(np.arange(0, 10000, 1000)).astype(pd.DatetimeTZDtype('ns', 'America/Los_Angeles'))})
> In [21]: t = pa.table(df)
> In [22]: t
> Out[22]: 
> pyarrow.Table
> a: timestamp[ns, tz=America/Los_Angeles]
> In [23]: pq.write_table(t, 'test.parquet')
> In [24]: pq.read_table('test.parquet')
> Out[24]: 
> pyarrow.Table
> a: timestamp[us, tz=UTC]
> In [25]: pq.read_table('test.parquet')[0]
> Out[25]: 
> <pyarrow.lib.ChunkedArray object at 0x7f72eb4b68f0>
> [
>   [
>     1970-01-01 00:00:00.000000,
>     1970-01-01 00:00:00.000001,
>     1970-01-01 00:00:00.000002,
>     1970-01-01 00:00:00.000003,
>     1970-01-01 00:00:00.000004,
>     1970-01-01 00:00:00.000005,
>     1970-01-01 00:00:00.000006,
>     1970-01-01 00:00:00.000007,
>     1970-01-01 00:00:00.000008,
>     1970-01-01 00:00:00.000009
>   ]
> ]
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)