You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "rok (via GitHub)" <gi...@apache.org> on 2023/02/15 20:57:51 UTC

[GitHub] [arrow] rok opened a new pull request, #34208: GH-33143: [C++] Kernel to convert timestamp with timezone to wall time

rok opened a new pull request, #34208:
URL: https://github.com/apache/arrow/pull/34208

   ### Rationale for this change
   
   We have `assume_timezone` to go from wall time timestamp to timestamp with a timezone. We might want a reverse operation to go from timestamp with a timezone to wall time.
   
   This is not needed for computation within Arrow, but would be needed if an application outsides of Arrow consumes wall time. E.g.: https://stackoverflow.com/questions/73275465/how-to-keep-original-datatime-in-pyarrow-table/73276431
   
   ### What changes are included in this PR?
   
   This adds a `local_time` C++ kernel.
   
   ### Are these changes tested?
   
   Yes.
   
   ### Are there any user-facing changes?
   
   Yes, `local_time` C++ kernel is added.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] github-actions[bot] commented on pull request #34208: GH-33143: [C++] Kernel to convert timestamp with timezone to wall time

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #34208:
URL: https://github.com/apache/arrow/pull/34208#issuecomment-1432022544

   * Closes: #33143


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] rok commented on pull request #34208: GH-33143: [C++] Kernel to convert timestamp with timezone to wall time

Posted by "rok (via GitHub)" <gi...@apache.org>.
rok commented on PR #34208:
URL: https://github.com/apache/arrow/pull/34208#issuecomment-1435009099

   @github-actions crossbow submit preview-docs


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] rok commented on a diff in pull request #34208: GH-33143: [C++] Kernel to convert timestamp with timezone to wall time

Posted by "rok (via GitHub)" <gi...@apache.org>.
rok commented on code in PR #34208:
URL: https://github.com/apache/arrow/pull/34208#discussion_r1109695004


##########
cpp/src/arrow/compute/api_scalar.h:
##########
@@ -1518,6 +1518,16 @@ ARROW_EXPORT Result<Datum> AssumeTimezone(const Datum& values,
 ARROW_EXPORT Result<Datum> IsDaylightSavings(const Datum& values,
                                              ExecContext* ctx = NULLPTR);
 
+/// \brief LocalTime converts timestamp to timezone naive local timestamp
+///
+/// \param[in] values input to convert to local time
+/// \param[in] ctx the function execution context, optional
+/// \return the resulting datum
+///
+/// \since 12.0.0
+/// \note API not yet finalized

Review Comment:
   Well, you could see it as an opportunity to sync the API with substrait :D



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] ursabot commented on pull request #34208: GH-33143: [C++] Kernel to convert timestamp with timezone to wall time

Posted by "ursabot (via GitHub)" <gi...@apache.org>.
ursabot commented on PR #34208:
URL: https://github.com/apache/arrow/pull/34208#issuecomment-1435526385

   Benchmark runs are scheduled for baseline = ae1f98f57f4bf38e6e38231c8566e6cbaa98d8ac and contender = daa5e267fc478ec016c950b7557bc10acbed3e5e. daa5e267fc478ec016c950b7557bc10acbed3e5e is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/8df438aa3ce04b208d51d64075bc60d0...5f5f01d695a345e68a51356b318f6e1b/)
   [Failed :arrow_down:0.31% :arrow_up:0.03%] [test-mac-arm](https://conbench.ursa.dev/compare/runs/946eb94828ed4a29a9ebfdfdfcc93a7a...bb60907a188c4ccebf03a13461b2d8b0/)
   [Finished :arrow_down:0.77% :arrow_up:0.0%] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/2d0b039fe7b64c61be37a20ab29e912f...1aa3df4e015e4ed08781b3aaaeec9bf0/)
   [Finished :arrow_down:0.54% :arrow_up:0.03%] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/5a2617d53983425db28e9e1b595b10d6...7335b41dda3e4b1dafca772150196f6f/)
   Buildkite builds:
   [Finished] [`daa5e267` ec2-t3-xlarge-us-east-2](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/2397)
   [Failed] [`daa5e267` test-mac-arm](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/2427)
   [Finished] [`daa5e267` ursa-i9-9960x](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/2395)
   [Finished] [`daa5e267` ursa-thinkcentre-m75q](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/2419)
   [Finished] [`ae1f98f5` ec2-t3-xlarge-us-east-2](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/2396)
   [Finished] [`ae1f98f5` test-mac-arm](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/2426)
   [Finished] [`ae1f98f5` ursa-i9-9960x](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/2394)
   [Finished] [`ae1f98f5` ursa-thinkcentre-m75q](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/2418)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] westonpace commented on a diff in pull request #34208: GH-33143: [C++] Kernel to convert timestamp with timezone to wall time

Posted by "westonpace (via GitHub)" <gi...@apache.org>.
westonpace commented on code in PR #34208:
URL: https://github.com/apache/arrow/pull/34208#discussion_r1110126490


##########
docs/source/cpp/compute.rst:
##########
@@ -1556,11 +1556,20 @@ Input timestamps are assumed to be relative to the timezone given in
 UTC-relative timestamps with the timezone metadata set to the above value.
 An error is returned if the timestamps already have the timezone metadata set.
 
+`local_time` function converts UTC-relative timestamps to local "timezone-naive"
+timestamp. The timezone is taken from the timezone metadata of the input

Review Comment:
   ```suggestion
   timestamps. The timezone is taken from the timezone metadata of the input
   ```



##########
docs/source/cpp/compute.rst:
##########
@@ -1556,11 +1556,20 @@ Input timestamps are assumed to be relative to the timezone given in
 UTC-relative timestamps with the timezone metadata set to the above value.
 An error is returned if the timestamps already have the timezone metadata set.
 
+`local_time` function converts UTC-relative timestamps to local "timezone-naive"
+timestamp. The timezone is taken from the timezone metadata of the input
+timestamps. This function is the inverse of `assume_timezone`. Please note:
+**all temporal functions already operate on timestamps as if they were in local
+time of metadata provided timezone**. Using `local_time` is only meant to be

Review Comment:
   ```suggestion
   time of the metadata provided timezone**. Using `local_time` is only meant to be
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] rok commented on pull request #34208: GH-33143: [C++] Kernel to convert timestamp with timezone to wall time

Posted by "rok (via GitHub)" <gi...@apache.org>.
rok commented on PR #34208:
URL: https://github.com/apache/arrow/pull/34208#issuecomment-1435009654

   Thanks @westonpace, merged the grammar changes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] jorisvandenbossche commented on a diff in pull request #34208: GH-33143: [C++] Kernel to convert timestamp with timezone to wall time

Posted by "jorisvandenbossche (via GitHub)" <gi...@apache.org>.
jorisvandenbossche commented on code in PR #34208:
URL: https://github.com/apache/arrow/pull/34208#discussion_r1111919924


##########
cpp/src/arrow/compute/kernels/scalar_temporal_unary.cc:
##########
@@ -1784,6 +1807,13 @@ const FunctionDoc is_dst_doc{
      "An error is returned if the values do not have a defined timezone."),
     {"values"}};
 
+const FunctionDoc local_time_doc{
+    "Convert timestamp to a timezone-naive local time timestamp",
+    ("LocalTime converts a timestamp to a local time of timestamps timezone\n"

Review Comment:
   Something is missing in this sentence? 



##########
cpp/src/arrow/compute/kernels/scalar_temporal_unary.cc:
##########
@@ -1801,8 +1831,9 @@ const FunctionDoc ceil_temporal_doc{
 const FunctionDoc round_temporal_doc{
     "Round temporal values to the nearest multiple of specified time unit",
     ("Null values emit null.\n"
-     "An error is returned if the values have a defined timezone but it\n"
-     "cannot be found in the timezone database."),
+     "If timezone is not given then timezone naive timestamp in UTC are\n"
+     "returned. An error is returned if the values have a defined timezone\n"
+     "but it cannot be found in the timezone database."),

Review Comment:
   Was this change intentional? Or was it meant to be in `local_time_doc`? 
   (and if intentional, why only for round and not ceil/floor?)



##########
cpp/src/arrow/compute/kernels/scalar_temporal_unary.cc:
##########
@@ -1784,6 +1807,13 @@ const FunctionDoc is_dst_doc{
      "An error is returned if the values do not have a defined timezone."),
     {"values"}};
 
+const FunctionDoc local_time_doc{
+    "Convert timestamp to a timezone-naive local time timestamp",
+    ("LocalTime converts a timestamp to a local time of timestamps timezone\n"
+     "and removes timezone metadata. If input is in UTC or doesn't have\n"
+     "timezone metadata, it is returned as is.\n"

Review Comment:
   Returned "as is" is not entirely correct for UTC, as I assume the "UTC" timezone is still removed?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] jorisvandenbossche commented on pull request #34208: GH-33143: [C++] Kernel to convert timestamp with timezone to wall time

Posted by "jorisvandenbossche (via GitHub)" <gi...@apache.org>.
jorisvandenbossche commented on PR #34208:
URL: https://github.com/apache/arrow/pull/34208#issuecomment-1436959754

   Nice to see this functionality! 
   
   One comment on the name of the function: given that "local_time" could also be interpreted as returning just the time (a `time32/64()` type) and not both date and time (and theoretically, this could also be a kernel we would want to add at some point), we could maybe use `"local_timestamp"` instead?  
   (I know that some will day that "local timestamp" is a contradictio in terminis, and that it should be "local datetime" instead, but since we don't use "datetime" for any other kernel name and "timestamp" is the name of the type anyway, that seems the best option).
   
   For reference, in Joda they call this `toLocalDateTime` (and they also have a `toLocalDate` and `toLocalTime`), in pandas the naming is confusing (`tz_localize(None)`), and in R's lubridate there is a `local_time` that does _somewhat_ related, except that it returns the difference in seconds instead of new timestamp object.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] westonpace commented on pull request #34208: GH-33143: [C++] Kernel to convert timestamp with timezone to wall time

Posted by "westonpace (via GitHub)" <gi...@apache.org>.
westonpace commented on PR #34208:
URL: https://github.com/apache/arrow/pull/34208#issuecomment-1435192066

   The test failures seem unrelated (I've opened a new issue for one which appears to be an intermittent segfault).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] westonpace merged pull request #34208: GH-33143: [C++] Kernel to convert timestamp with timezone to wall time

Posted by "westonpace (via GitHub)" <gi...@apache.org>.
westonpace merged PR #34208:
URL: https://github.com/apache/arrow/pull/34208


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] rok commented on pull request #34208: GH-33143: [C++] Kernel to convert timestamp with timezone to wall time

Posted by "rok (via GitHub)" <gi...@apache.org>.
rok commented on PR #34208:
URL: https://github.com/apache/arrow/pull/34208#issuecomment-1437299866

   Thanks for the review @jorisvandenbossche. I've opened a PR to address it: #34263.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] github-actions[bot] commented on pull request #34208: GH-33143: [C++] Kernel to convert timestamp with timezone to wall time

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #34208:
URL: https://github.com/apache/arrow/pull/34208#issuecomment-1435011716

   Revision: cf2eb81466fe9ea50911138aa6dd28e07010ba03
   
   Submitted crossbow builds: [ursacomputing/crossbow @ actions-760f71e893](https://github.com/ursacomputing/crossbow/branches/all?query=actions-760f71e893)
   
   |Task|Status|
   |----|------|
   |preview-docs|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-760f71e893-github-preview-docs)](https://github.com/ursacomputing/crossbow/actions/runs/4206191248/jobs/7299334794)|


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] github-actions[bot] commented on pull request #34208: GH-33143: [C++] Kernel to convert timestamp with timezone to wall time

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #34208:
URL: https://github.com/apache/arrow/pull/34208#issuecomment-1434539761

   Revision: 115347ee7cbc2efe4ab15477e34afa102e0a6a10
   
   Submitted crossbow builds: [ursacomputing/crossbow @ actions-ffd22963ef](https://github.com/ursacomputing/crossbow/branches/all?query=actions-ffd22963ef)
   
   |Task|Status|
   |----|------|
   |preview-docs|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-ffd22963ef-github-preview-docs)](https://github.com/ursacomputing/crossbow/actions/runs/4203502742/jobs/7292983966)|


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] westonpace commented on a diff in pull request #34208: GH-33143: [C++] Kernel to convert timestamp with timezone to wall time

Posted by "westonpace (via GitHub)" <gi...@apache.org>.
westonpace commented on code in PR #34208:
URL: https://github.com/apache/arrow/pull/34208#discussion_r1109171816


##########
docs/source/cpp/compute.rst:
##########
@@ -1445,6 +1445,8 @@ For timestamps inputs with non-empty timezone, localized timestamp components wi
 +--------------------+------------+-------------------+---------------+----------------------------+-------+
 | is_leap_year       | Unary      | Timestamp, Date   | Boolean       |                            |       |
 +--------------------+------------+-------------------+---------------+----------------------------+-------+
+| local_time         | Unary      | Timestamp         | Timestamp     |                            |       |

Review Comment:
   Can we add a note?  I don't think it will be obvious what this function does.



##########
cpp/src/arrow/compute/api_scalar.h:
##########
@@ -1518,6 +1518,16 @@ ARROW_EXPORT Result<Datum> AssumeTimezone(const Datum& values,
 ARROW_EXPORT Result<Datum> IsDaylightSavings(const Datum& values,
                                              ExecContext* ctx = NULLPTR);
 
+/// \brief LocalTime converts timestamp to timezone naive local timestamp
+///
+/// \param[in] values input to convert to local time
+/// \param[in] ctx the function execution context, optional
+/// \return the resulting datum
+///
+/// \since 12.0.0
+/// \note API not yet finalized

Review Comment:
   I was going to suggest removing this but now I see this is everywhere.  Hmm.... :(  I'm not sure how true this is these days.
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] rok commented on a diff in pull request #34208: GH-33143: [C++] Kernel to convert timestamp with timezone to wall time

Posted by "rok (via GitHub)" <gi...@apache.org>.
rok commented on code in PR #34208:
URL: https://github.com/apache/arrow/pull/34208#discussion_r1109694127


##########
docs/source/cpp/compute.rst:
##########
@@ -1445,6 +1445,8 @@ For timestamps inputs with non-empty timezone, localized timestamp components wi
 +--------------------+------------+-------------------+---------------+----------------------------+-------+
 | is_leap_year       | Unary      | Timestamp, Date   | Boolean       |                            |       |
 +--------------------+------------+-------------------+---------------+----------------------------+-------+
+| local_time         | Unary      | Timestamp         | Timestamp     |                            |       |

Review Comment:
   Good point. Added some text and moved under the `Timezone handling` chapter. I'll trigger a docs preview to make sure I've got the markup right.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] rok commented on pull request #34208: GH-33143: [C++] Kernel to convert timestamp with timezone to wall time

Posted by "rok (via GitHub)" <gi...@apache.org>.
rok commented on PR #34208:
URL: https://github.com/apache/arrow/pull/34208#issuecomment-1434536936

   @github-actions crossbow submit preview-docs


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] rok commented on pull request #34208: GH-33143: [C++] Kernel to convert timestamp with timezone to wall time

Posted by "rok (via GitHub)" <gi...@apache.org>.
rok commented on PR #34208:
URL: https://github.com/apache/arrow/pull/34208#issuecomment-1434671828

   Docs preview: http://crossbow.voltrondata.com/pr_docs/34208/cpp/compute.html#timezone-handling


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org