Posted to jira@arrow.apache.org by "Antoine Pitrou (Jira)" <ji...@apache.org> on 2021/06/10 12:11:00 UTC

[jira] [Commented] (ARROW-13033) [C++] Kernel to localize naive timestamps to a timezone (preserving clock-time)

    [ https://issues.apache.org/jira/browse/ARROW-13033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17360829#comment-17360829 ] 

Antoine Pitrou commented on ARROW-13033:
----------------------------------------

Hmm, I think there's a misunderstanding. The timestamp value is always UTC. The timezone is only supposed to govern how the timestamp is displayed to the user.

{code}
  /// The time zone is a string indicating the name of a time zone, one of:
  ///
  /// * As used in the Olson time zone database (the "tz database" or
  ///   "tzdata"), such as "America/New_York"
  /// * An absolute time zone offset of the form +XX:XX or -XX:XX, such as +07:30
  ///
  /// Whether a timezone string is present indicates different semantics about
  /// the data:
  ///
  /// * If the time zone is null or equal to an empty string, the data is "time
  ///   zone naive" and shall be displayed *as is* to the user, not localized
  ///   to the locale of the user. This data can be thought of as UTC but
  ///   without having "UTC" as the time zone, it is not considered to be
  ///   localized to any time zone
  ///
  /// * If the time zone is set to a valid value, values can be displayed as
  ///   "localized" to that time zone, even though the underlying 64-bit
  ///   integers are identical to the same data stored in UTC. Converting
  ///   between time zones is a metadata-only operation and does not change the
  ///   underlying values
{code}
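
As a rough illustration of the "metadata-only" point, here is a minimal sketch using only Python's standard library rather than Arrow itself. The `render` helper and the sample instant are assumptions made up for this example; the +07:30 offset is the one mentioned in the comment above. The stored integer never changes, only the zone used to print it.

```python
from datetime import datetime, timedelta, timezone

# Stored value: seconds since the Unix epoch, always counted in UTC.
# The instant itself is an arbitrary sample chosen for this sketch.
stored = int(datetime(2021, 6, 10, 12, 11, tzinfo=timezone.utc).timestamp())


def render(epoch_seconds, tz):
    """Hypothetical helper: format the same stored integer in a given zone."""
    return datetime.fromtimestamp(epoch_seconds, tz).isoformat()


# Two "time zone" settings applied to the identical underlying integer.
utc_view = render(stored, timezone.utc)
offset_view = render(stored, timezone(timedelta(hours=7, minutes=30)))

print(stored, utc_view, offset_view)
```

Switching from UTC to +07:30 changes the printed clock time but leaves `stored` untouched, which is what makes such a conversion a metadata-only operation.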

> [C++] Kernel to localize naive timestamps to a timezone (preserving clock-time)
> -------------------------------------------------------------------------------
>
>                 Key: ARROW-13033
>                 URL: https://issues.apache.org/jira/browse/ARROW-13033
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Joris Van den Bossche
>            Priority: Major
>
> Given a tz-naive timestamp, "localize" would interpret that timestamp as local in a given timezone, and return a tz-aware timestamp keeping the same "clock time" (the same year/month/day/hour/etc in the printed representation). Under the hood this converts the timestamp value from that timezone to UTC, since tz-aware timestamps are stored as UTC.
> References: [tz_localize|https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DatetimeIndex.tz_localize.html] in pandas, or [force_tz|https://lubridate.tidyverse.org/reference/force_tz.html] in R's lubridate package
> This will (eventually) also have to deal with ambiguous or non-existing times.
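
For contrast, the "localize" operation described in the issue does change the stored value. The following is a minimal sketch in plain Python, not the proposed Arrow kernel: the `localize` function, its signature, and the sample values are hypothetical, and it only handles a fixed UTC offset. A real implementation for named zones such as "America/New_York" would need the tz database and a policy for ambiguous or non-existing times, which this sketch deliberately leaves out.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1)


def localize(naive_epoch_seconds, utc_offset):
    """Hypothetical helper: treat a naive value as wall-clock time at
    utc_offset and return the corresponding UTC epoch seconds."""
    return naive_epoch_seconds - int(utc_offset.total_seconds())


# Naive value: the clock reading 2021-06-10 12:11:00 with no zone attached.
naive_value = int((datetime(2021, 6, 10, 12, 11) - EPOCH).total_seconds())
offset = timedelta(hours=7, minutes=30)

# The stored integer shifts by the offset so that it now counts from UTC.
aware_value = localize(naive_value, offset)

# Displayed in the target zone, the clock time is preserved.
shown = datetime.fromtimestamp(aware_value, timezone(offset)).isoformat()
in_utc = datetime.fromtimestamp(aware_value, timezone.utc).isoformat()

print(naive_value, aware_value, shown, in_utc)
```

Here the printed clock time stays at 12:11 while the underlying integer moves by the offset, the opposite of a metadata-only time zone change.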



--
This message was sent by Atlassian Jira
(v8.3.4#803005)