Posted to issues@arrow.apache.org by "Gaurav Sheni (Jira)" <ji...@apache.org> on 2022/05/11 21:14:00 UTC
[jira] [Created] (ARROW-16540) Support storing different timezone in an array
Gaurav Sheni created ARROW-16540:
------------------------------------
Summary: Support storing different timezone in an array
Key: ARROW-16540
URL: https://issues.apache.org/jira/browse/ARROW-16540
Project: Apache Arrow
Issue Type: New Feature
Components: Format, Python
Reporter: Gaurav Sheni
As a user, I wish I could use pyarrow to store a column of datetimes with different timezones. In certain datasets, it is ideal to have a column with mixed timezones (e.g., taxi pickups). Even if the data is limited to a single location (say, a business in NYC) over the span of a single year, the timezones will be EDT and EST, with offsets of -4:00 and -5:00.
Currently, it is not possible to keep a column with different timezones.
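{code:python}
from datetime import datetime

import pytz
import pyarrow as pa

# Two timezone-aware datetimes, each in a different timezone
arr = pa.array([
    datetime(2010, 1, 1, tzinfo=pytz.timezone('US/Central')),
    datetime(2015, 1, 1, tzinfo=pytz.timezone('US/Eastern')),
])
arr.type  # the array type carries a single timezone
arr[0]
arr[1]
{code}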
{code:java}
TimestampType(timestamp[us, tz=US/Central])
<pyarrow.TimestampScalar: datetime.datetime(2014, 12, 31, 18, 0, tzinfo=<DstTzInfo 'US/Central' CST-1 day, 18:00:00 STD>)>
Out[25]: <pyarrow.TimestampScalar: datetime.datetime(2009, 12, 31, 18, 0, tzinfo=<DstTzInfo 'US/Central' CST-1 day, 18:00:00 STD>)>
{code}
> Notice how both rows now have the Central timezone.
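For illustration, here is a minimal sketch of one possible interim workaround, not an existing pyarrow feature: keep every instant normalized to UTC and carry the original UTC offset in a separate, parallel column. It uses only the standard library. The names `pickups`, `as_utc`, `offset_minutes`, and `restored` are hypothetical, and the fixed `EST`/`EDT` offsets stand in for the NYC example above. The two resulting lists could presumably be stored as two ordinary Arrow columns (a UTC timestamp column and an integer column), but that layout is an assumption.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset stand-ins for the two NYC offsets mentioned above (assumption:
# real data would use a full timezone database rather than fixed offsets).
EST = timezone(timedelta(hours=-5), "EST")
EDT = timezone(timedelta(hours=-4), "EDT")

# Hypothetical pickups: same wall-clock hour, different offsets.
pickups = [
    datetime(2022, 1, 15, 9, 0, tzinfo=EST),
    datetime(2022, 7, 15, 9, 0, tzinfo=EDT),
]

# Column 1: the instants, normalized to a single timezone (UTC).
as_utc = [p.astimezone(timezone.utc) for p in pickups]

# Column 2: the original UTC offset of each row, in minutes.
offset_minutes = [int(p.utcoffset().total_seconds() // 60) for p in pickups]

# Reading back: re-apply each row's offset to recover the local wall-clock time.
restored = [
    u.astimezone(timezone(timedelta(minutes=m)))
    for u, m in zip(as_utc, offset_minutes)
]

for original, back in zip(pickups, restored):
    print(original.isoformat(), "->", back.isoformat())
```

This keeps the offsets but not the timezone names (a restored value knows it is at -05:00, not that it is "EST"), so it only approximates a true mixed-timezone column.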
--
This message was sent by Atlassian Jira
(v8.20.7#820007)