Posted to issues@arrow.apache.org by "Florian Jetter (JIRA)" <ji...@apache.org> on 2019/07/09 15:05:00 UTC
[jira] [Created] (ARROW-5888) [Python][C++] Parquet write metadata not roundtrip safe for timezone timestamps
Florian Jetter created ARROW-5888:
-------------------------------------
Summary: [Python][C++] Parquet write metadata not roundtrip safe for timezone timestamps
Key: ARROW-5888
URL: https://issues.apache.org/jira/browse/ARROW-5888
Project: Apache Arrow
Issue Type: Bug
Reporter: Florian Jetter
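Timestamp timezones other than UTC are not roundtrip safe when storing to Parquet: after writing the schema with write_metadata and reading it back, a field declared with tz="Europe/Berlin" comes back as tz=UTC. The expected behavior is that the original timezone is properly reconstructed.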
{code:python}
import pyarrow as pa
import pyarrow.parquet as pq

# One field per case: no timezone, UTC, and a non-UTC timezone.
schema = pa.schema(
    [
        pa.field("no_tz", pa.timestamp('us')),
        pa.field("utc", pa.timestamp('us', tz="UTC")),
        pa.field("europe", pa.timestamp('us', tz="Europe/Berlin")),
    ]
)

# Write only the schema metadata to an in-memory buffer.
buf = pa.BufferOutputStream()
pq.write_metadata(
    schema,
    buf,
    coerce_timestamps="us"
)

# Read the metadata back and convert it to an Arrow schema.
pq_bytes = buf.getvalue().to_pybytes()
reader = pa.BufferReader(pq_bytes)
parquet_file = pq.ParquetFile(reader)
parquet_file.schema.to_arrow_schema()
# Output:
# no_tz: timestamp[us]
# utc: timestamp[us, tz=UTC]
# europe: timestamp[us, tz=UTC]   <- expected tz=Europe/Berlin
{code}
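To make the mismatch explicit, the declared and observed field types can be compared by name. The following is only a minimal sketch: the helper name schema_mismatches is hypothetical, and the type strings are copied from the example above rather than produced by pyarrow, so it runs without any dependencies.

```python
# Declared types, as written in the schema of the report.
expected = {
    "no_tz": "timestamp[us]",
    "utc": "timestamp[us, tz=UTC]",
    "europe": "timestamp[us, tz=Europe/Berlin]",
}

# Types observed after the roundtrip, copied from the reported output.
observed = {
    "no_tz": "timestamp[us]",
    "utc": "timestamp[us, tz=UTC]",
    "europe": "timestamp[us, tz=UTC]",
}


def schema_mismatches(expected, observed):
    """Return {name: (expected_type, observed_type)} for fields that differ.

    Hypothetical helper for illustration; a field missing from `observed`
    is reported with None as its observed type.
    """
    return {
        name: (exp_type, observed.get(name))
        for name, exp_type in expected.items()
        if observed.get(name) != exp_type
    }


mismatches = schema_mismatches(expected, observed)
for name, (exp_type, obs_type) in mismatches.items():
    print(f"{name}: expected {exp_type}, got {obs_type}")
```

With pyarrow available, the two mappings could presumably be built from real schemas with something like {name: str(typ) for name, typ in zip(schema.names, schema.types)}, which would turn the comparison into a regression check for this issue.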