Posted to issues@arrow.apache.org by "Wes McKinney (Jira)" <ji...@apache.org> on 2020/08/03 16:38:00 UTC

[jira] [Created] (ARROW-9634) [C++][Python] Restore non-UTC time zones when reading Parquet file that was previously Arrow

Wes McKinney created ARROW-9634:
-----------------------------------

             Summary: [C++][Python] Restore non-UTC time zones when reading Parquet file that was previously Arrow
                 Key: ARROW-9634
                 URL: https://issues.apache.org/jira/browse/ARROW-9634
             Project: Apache Arrow
          Issue Type: Bug
          Components: C++, Python
            Reporter: Wes McKinney
             Fix For: 2.0.0


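This was reported on the mailing list. A timestamp column with a non-UTC time zone (here America/Los_Angeles) comes back with tz=UTC after a round trip through Parquet: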

{code}
In [20]: df = pd.DataFrame({'a': pd.Series(np.arange(0, 10000, 1000)).astype(pd.DatetimeTZDtype('ns', 'America/Los_Angeles'))})

In [21]: t = pa.table(df)                                                                                                  

In [22]: t                                                                                                                 
Out[22]: 
pyarrow.Table
a: timestamp[ns, tz=America/Los_Angeles]

In [23]: pq.write_table(t, 'test.parquet')                                                                                 

In [24]: pq.read_table('test.parquet')                                                                                     
Out[24]: 
pyarrow.Table
a: timestamp[us, tz=UTC]

In [25]: pq.read_table('test.parquet')[0]                                                                                  
Out[25]: 
<pyarrow.lib.ChunkedArray object at 0x7f72eb4b68f0>
[
  [
    1970-01-01 00:00:00.000000,
    1970-01-01 00:00:00.000001,
    1970-01-01 00:00:00.000002,
    1970-01-01 00:00:00.000003,
    1970-01-01 00:00:00.000004,
    1970-01-01 00:00:00.000005,
    1970-01-01 00:00:00.000006,
    1970-01-01 00:00:00.000007,
    1970-01-01 00:00:00.000008,
    1970-01-01 00:00:00.000009
  ]
]
{code}


