Posted to github@arrow.apache.org by "AlvinJ15 (via GitHub)" <gi...@apache.org> on 2023/03/30 17:51:35 UTC

[GitHub] [arrow] AlvinJ15 opened a new pull request, #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

AlvinJ15 opened a new pull request, #12528:
URL: https://github.com/apache/arrow/pull/12528

   Temporal floor/ceil/round handle ambiguous/nonexistent local time


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] pitrou commented on a diff in pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
pitrou commented on code in PR #12528:
URL: https://github.com/apache/arrow/pull/12528#discussion_r871475775


##########
cpp/src/arrow/compute/kernels/scalar_temporal_test.cc:
##########
@@ -2611,6 +2611,260 @@ TEST_F(ScalarTemporalTest, TestRoundTemporal) {
   CheckScalarUnary(op, unit, times, unit, round_15_years, &round_to_15_years);
 }
 
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalAmbiguous1) {
+  // Asia/Tehran switches from UTC+4:30 to UTC+3:30 on 2022-09-22 00:00:00 UTC+4:30.
+  // This causes an hour long ambiguous period in local time.
+  auto unit = timestamp(TimeUnit::MILLI, "Asia/Tehran");
+  auto options = RoundTemporalOptions(25, CalendarUnit::MINUTE);
+  const char* times = R"([
+    "2022-09-21 18:09:00", "2022-09-21 18:10:00", "2022-09-21 18:11:00",
+    "2022-09-21 18:19:00", "2022-09-21 18:20:00", "2022-09-21 18:21:00",
+    "2022-09-21 18:44:00", "2022-09-21 18:45:00", "2022-09-21 18:46:00",
+    "2022-09-21 19:09:00", "2022-09-21 19:10:00", "2022-09-21 19:11:00",
+    "2022-09-21 19:24:00", "2022-09-21 19:25:00", "2022-09-21 19:26:00",
+    "2022-09-21 19:34:00", "2022-09-21 19:35:00", "2022-09-21 19:36:00",
+    "2022-09-21 19:59:00", "2022-09-21 20:00:00", "2022-09-21 20:01:00",
+    "2022-09-21 20:24:00", "2022-09-21 20:25:00", "2022-09-21 20:26:00",
+    "2022-09-21 20:49:00", "2022-09-21 20:50:00", "2022-09-21 20:51:00"])";
+  const char* times_ceil = R"([
+    "2022-09-21 18:10:00", "2022-09-21 18:10:00", "2022-09-21 18:20:00",
+    "2022-09-21 18:20:00", "2022-09-21 18:20:00", "2022-09-21 18:45:00",
+    "2022-09-21 18:45:00", "2022-09-21 18:45:00", "2022-09-21 19:10:00",
+    "2022-09-21 19:10:00", "2022-09-21 19:10:00", "2022-09-21 19:35:00",
+    "2022-09-21 19:35:00", "2022-09-21 19:35:00", "2022-09-21 19:35:00",
+    "2022-09-21 19:35:00", "2022-09-21 19:35:00", "2022-09-21 20:00:00",
+    "2022-09-21 20:00:00", "2022-09-21 20:00:00", "2022-09-21 20:25:00",
+    "2022-09-21 20:25:00", "2022-09-21 20:25:00", "2022-09-21 20:50:00",
+    "2022-09-21 20:50:00", "2022-09-21 20:50:00", "2022-09-21 21:15:00"])";
+  const char* times_floor = R"([
+    "2022-09-21 17:45:00", "2022-09-21 18:10:00", "2022-09-21 18:10:00",
+    "2022-09-21 18:10:00", "2022-09-21 18:10:00", "2022-09-21 18:10:00",
+    "2022-09-21 18:20:00", "2022-09-21 18:45:00", "2022-09-21 18:45:00",
+    "2022-09-21 18:45:00", "2022-09-21 19:10:00", "2022-09-21 19:10:00",
+    "2022-09-21 19:10:00", "2022-09-21 19:10:00", "2022-09-21 19:10:00",
+    "2022-09-21 19:10:00", "2022-09-21 19:35:00", "2022-09-21 19:35:00",
+    "2022-09-21 19:35:00", "2022-09-21 20:00:00", "2022-09-21 20:00:00",
+    "2022-09-21 20:00:00", "2022-09-21 20:25:00", "2022-09-21 20:25:00",
+    "2022-09-21 20:25:00", "2022-09-21 20:50:00", "2022-09-21 20:50:00"])";
+  const char* times_round = R"([
+    "2022-09-21 18:10:00", "2022-09-21 18:10:00", "2022-09-21 18:10:00",
+    "2022-09-21 18:10:00", "2022-09-21 18:10:00", "2022-09-21 18:10:00",
+    "2022-09-21 18:45:00", "2022-09-21 18:45:00", "2022-09-21 18:45:00",
+    "2022-09-21 19:10:00", "2022-09-21 19:10:00", "2022-09-21 19:10:00",
+    "2022-09-21 19:35:00", "2022-09-21 19:35:00", "2022-09-21 19:35:00",
+    "2022-09-21 19:35:00", "2022-09-21 19:35:00", "2022-09-21 19:35:00",
+    "2022-09-21 20:00:00", "2022-09-21 20:00:00", "2022-09-21 20:00:00",
+    "2022-09-21 20:25:00", "2022-09-21 20:25:00", "2022-09-21 20:25:00",
+    "2022-09-21 20:50:00", "2022-09-21 20:50:00", "2022-09-21 20:50:00"])";
+
+  CheckScalarUnary("ceil_temporal", unit, times, unit, times_ceil, &options);
+  CheckScalarUnary("floor_temporal", unit, times, unit, times_floor, &options);
+  CheckScalarUnary("round_temporal", unit, times, unit, times_round, &options);
+}
+
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalAmbiguous2) {
+  // Europe/Brussels switches from UTC+2:00 to UTC+1:00 on 2018-10-28 03:00:00 UTC+2:00
+  // This causes an hour long ambiguous period in local time.
+  auto unit = timestamp(TimeUnit::NANO, "Europe/Brussels");
+  auto options = RoundTemporalOptions(25, CalendarUnit::MINUTE);
+  const char* complete_times = R"([
+    "2018-10-27 23:05:00", "2018-10-27 23:06:00", "2018-10-27 23:07:00", "2018-10-27 23:08:00",
+    "2018-10-27 23:09:00", "2018-10-27 23:10:00", "2018-10-27 23:11:00", "2018-10-27 23:12:00",
+    "2018-10-27 23:13:00", "2018-10-27 23:14:00", "2018-10-27 23:15:00", "2018-10-27 23:16:00",
+    "2018-10-27 23:17:00", "2018-10-27 23:18:00", "2018-10-27 23:19:00", "2018-10-27 23:20:00",
+    "2018-10-27 23:21:00", "2018-10-27 23:22:00", "2018-10-27 23:23:00", "2018-10-27 23:24:00",
+    "2018-10-27 23:25:00", "2018-10-27 23:26:00", "2018-10-27 23:27:00", "2018-10-27 23:28:00",
+    "2018-10-27 23:29:00", "2018-10-27 23:30:00", "2018-10-27 23:31:00", "2018-10-27 23:32:00",
+    "2018-10-27 23:33:00", "2018-10-27 23:34:00", "2018-10-27 23:35:00", "2018-10-27 23:36:00",
+    "2018-10-27 23:37:00", "2018-10-27 23:38:00", "2018-10-27 23:39:00", "2018-10-27 23:40:00",
+    "2018-10-27 23:41:00", "2018-10-27 23:42:00", "2018-10-27 23:43:00", "2018-10-27 23:44:00",
+    "2018-10-27 23:45:00", "2018-10-27 23:46:00", "2018-10-27 23:47:00", "2018-10-27 23:48:00",
+    "2018-10-27 23:49:00", "2018-10-27 23:50:00", "2018-10-27 23:51:00", "2018-10-27 23:52:00",
+    "2018-10-27 23:53:00", "2018-10-27 23:54:00", "2018-10-27 23:55:00", "2018-10-27 23:56:00",
+    "2018-10-27 23:57:00", "2018-10-27 23:58:00", "2018-10-27 23:59:00", "2018-10-28 00:00:00",
+    "2018-10-28 00:01:00", "2018-10-28 00:02:00", "2018-10-28 00:03:00", "2018-10-28 00:04:00",
+    "2018-10-28 00:05:00", "2018-10-28 00:06:00", "2018-10-28 00:07:00", "2018-10-28 00:08:00",
+    "2018-10-28 00:09:00", "2018-10-28 00:10:00", "2018-10-28 00:11:00", "2018-10-28 00:12:00",
+    "2018-10-28 00:13:00", "2018-10-28 00:14:00", "2018-10-28 00:15:00", "2018-10-28 00:16:00",
+    "2018-10-28 00:17:00", "2018-10-28 00:18:00", "2018-10-28 00:19:00", "2018-10-28 00:20:00",
+    "2018-10-28 00:21:00", "2018-10-28 00:22:00", "2018-10-28 00:23:00", "2018-10-28 00:24:00",
+    "2018-10-28 00:25:00", "2018-10-28 00:26:00", "2018-10-28 00:27:00", "2018-10-28 00:28:00",
+    "2018-10-28 00:29:00", "2018-10-28 00:30:00", "2018-10-28 00:31:00", "2018-10-28 00:32:00",
+    "2018-10-28 00:33:00", "2018-10-28 00:34:00", "2018-10-28 00:35:00", "2018-10-28 00:36:00",
+    "2018-10-28 00:37:00", "2018-10-28 00:38:00", "2018-10-28 00:39:00", "2018-10-28 00:40:00",
+    "2018-10-28 00:41:00", "2018-10-28 00:42:00", "2018-10-28 00:43:00", "2018-10-28 00:44:00",
+    "2018-10-28 00:45:00", "2018-10-28 00:46:00", "2018-10-28 00:47:00", "2018-10-28 00:48:00",
+    "2018-10-28 00:49:00", "2018-10-28 00:50:00", "2018-10-28 00:51:00", "2018-10-28 00:52:00",
+    "2018-10-28 00:53:00", "2018-10-28 00:54:00", "2018-10-28 00:55:00", "2018-10-28 00:56:00",
+    "2018-10-28 00:57:00", "2018-10-28 00:58:00", "2018-10-28 00:59:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:01:00", "2018-10-28 01:02:00", "2018-10-28 01:03:00", "2018-10-28 01:04:00",
+    "2018-10-28 01:05:00", "2018-10-28 01:06:00", "2018-10-28 01:07:00", "2018-10-28 01:08:00",
+    "2018-10-28 01:09:00", "2018-10-28 01:10:00", "2018-10-28 01:11:00", "2018-10-28 01:12:00",
+    "2018-10-28 01:13:00", "2018-10-28 01:14:00", "2018-10-28 01:15:00", "2018-10-28 01:16:00",
+    "2018-10-28 01:17:00", "2018-10-28 01:18:00", "2018-10-28 01:19:00", "2018-10-28 01:20:00",
+    "2018-10-28 01:21:00", "2018-10-28 01:22:00", "2018-10-28 01:23:00", "2018-10-28 01:24:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:26:00", "2018-10-28 01:27:00", "2018-10-28 01:28:00",
+    "2018-10-28 01:29:00", "2018-10-28 01:30:00", "2018-10-28 01:31:00", "2018-10-28 01:32:00",
+    "2018-10-28 01:33:00", "2018-10-28 01:34:00", "2018-10-28 01:35:00", "2018-10-28 01:36:00",
+    "2018-10-28 01:37:00", "2018-10-28 01:38:00", "2018-10-28 01:39:00", "2018-10-28 01:40:00",
+    "2018-10-28 01:41:00", "2018-10-28 01:42:00", "2018-10-28 01:43:00", "2018-10-28 01:44:00",
+    "2018-10-28 01:45:00", "2018-10-28 01:46:00", "2018-10-28 01:47:00", "2018-10-28 01:48:00",
+    "2018-10-28 01:49:00", "2018-10-28 01:50:00", "2018-10-28 01:51:00", "2018-10-28 01:52:00",
+    "2018-10-28 01:53:00", "2018-10-28 01:54:00", "2018-10-28 01:55:00", "2018-10-28 01:56:00",
+    "2018-10-28 01:57:00", "2018-10-28 01:58:00", "2018-10-28 01:59:00", "2018-10-28 02:00:00",
+    "2018-10-28 02:01:00", "2018-10-28 02:02:00", "2018-10-28 02:03:00", "2018-10-28 02:04:00",
+    "2018-10-28 02:05:00", "2018-10-28 02:06:00", "2018-10-28 02:07:00", "2018-10-28 02:08:00",
+    "2018-10-28 02:09:00", "2018-10-28 02:10:00", "2018-10-28 02:11:00", "2018-10-28 02:12:00",
+    "2018-10-28 02:13:00", "2018-10-28 02:14:00", "2018-10-28 02:15:00", "2018-10-28 02:16:00"])";
+
+  const char* complete_times_floor = R"([
+    "2018-10-27 22:45:00", "2018-10-27 22:45:00", "2018-10-27 22:45:00", "2018-10-27 22:45:00",
+    "2018-10-27 22:45:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:45:00",
+    "2018-10-27 23:45:00", "2018-10-27 23:45:00", "2018-10-27 23:45:00", "2018-10-27 23:45:00",
+    "2018-10-27 23:45:00", "2018-10-27 23:45:00", "2018-10-27 23:45:00", "2018-10-27 23:45:00",
+    "2018-10-27 23:45:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00",
+    "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00",
+    "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00",
+    "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00",
+    "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00",
+    "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00",
+    "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00",
+    "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00",
+    "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00",
+    "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00",
+    "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00",
+    "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00",
+    "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 02:15:00", "2018-10-28 02:15:00"])";
+
+  const char* times = R"([
+    "2018-10-27 22:44:00", "2018-10-27 22:45:00", "2018-10-27 22:46:00",
+    "2018-10-27 23:09:00", "2018-10-27 23:10:00", "2018-10-27 23:11:00",
+    "2018-10-27 23:34:00", "2018-10-27 23:35:00", "2018-10-27 23:36:00",
+    "2018-10-27 23:44:00", "2018-10-27 23:45:00", "2018-10-27 23:46:00",
+    "2018-10-27 23:46:00", "2018-10-28 00:00:00", "2018-10-28 00:09:00",
+    "2018-10-28 00:09:00", "2018-10-28 00:10:00", "2018-10-28 00:11:00",
+    "2018-10-28 00:34:00", "2018-10-28 00:35:00", "2018-10-28 00:36:00",
+    "2018-10-28 00:59:00", "2018-10-28 01:00:00", "2018-10-28 01:01:00",
+    "2018-10-28 01:24:00", "2018-10-28 01:25:00", "2018-10-28 01:26:00",
+    "2018-10-28 01:49:00", "2018-10-28 01:50:00", "2018-10-28 01:51:00",
+    "2018-10-28 02:14:00", "2018-10-28 02:15:00", "2018-10-28 02:16:00"])";
+  const char* times_ceil = R"([
+    "2018-10-27 22:45:00", "2018-10-27 22:45:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:45:00",
+    "2018-10-27 23:45:00", "2018-10-27 23:45:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:35:00",
+    "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:50:00",
+    "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 02:15:00",
+    "2018-10-28 02:15:00", "2018-10-28 02:15:00", "2018-10-28 02:40:00"])";
+  const char* times_floor = R"([
+    "2018-10-27 22:20:00", "2018-10-27 22:45:00", "2018-10-27 22:45:00",
+    "2018-10-27 22:45:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:45:00", "2018-10-27 23:45:00",
+    "2018-10-27 23:45:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00",
+    "2018-10-28 00:35:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00",
+    "2018-10-28 01:50:00", "2018-10-28 02:15:00", "2018-10-28 02:15:00"])";
+  const char* times_round = R"([
+    "2018-10-27 22:45:00", "2018-10-27 22:45:00", "2018-10-27 22:45:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00",
+    "2018-10-28 02:15:00", "2018-10-28 02:15:00", "2018-10-28 02:15:00"])";
+
+  CheckScalarUnary("floor_temporal", unit, complete_times, unit, complete_times_floor,
+                   &options);
+  CheckScalarUnary("ceil_temporal", unit, times, unit, times_ceil, &options);
+  CheckScalarUnary("floor_temporal", unit, times, unit, times_floor, &options);
+  CheckScalarUnary("round_temporal", unit, times, unit, times_round, &options);
+}
+
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalNonexistent1) {
+  // Asia/Tehran switches from UTC+3:30 to UTC+4:30 on 2022-03-22 00:00:00 UTC+3:30
+  // This causes an hour long non-existing period in local time.
+  auto unit = timestamp(TimeUnit::SECOND, "Asia/Tehran");
+  auto options = RoundTemporalOptions(16, CalendarUnit::MINUTE);
+  const char* times = R"([
+    "2022-03-21 19:30:00", "2022-03-21 20:00:00", "2022-03-21 20:30:00"])";
+  const char* times_ceil = R"([
+    "2022-03-21 19:42:00", "2022-03-21 20:14:00", "2022-03-21 20:34:00"])";
+  const char* times_floor = R"([
+    "2022-03-21 19:26:00", "2022-03-21 19:58:00", "2022-03-21 20:28:00"])";

Review Comment:
   The other values look ok, but `20:28` is `23:58` in local time, which does not seem to be a multiple of 16...
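
The arithmetic behind this comment can be checked with a small sketch. It assumes the pre-transition offset of UTC+3:30 given in the test comment and hard-codes it instead of consulting a time zone database. `LocalMinuteOfDay`, `IsMultipleOf` and `CheckReviewedValues` are hypothetical helper names, not Arrow functions. If rounding is done on local minutes since the epoch, checking the local minute of the day is equivalent, because a day has 1440 minutes and 1440 is divisible by 16.

```cpp
#include <cassert>

// Minutes since local midnight for a UTC wall-clock time and a fixed UTC offset.
// The offset is an explicit input (assumption); no tz database is used here.
constexpr int LocalMinuteOfDay(int utc_hour, int utc_minute, int offset_minutes) {
  const int total = utc_hour * 60 + utc_minute + offset_minutes;
  return ((total % 1440) + 1440) % 1440;  // wrap into [0, 1440)
}

// True when the local minute of the day lands on a multiple of `multiple`.
constexpr bool IsMultipleOf(int minute_of_day, int multiple) {
  return minute_of_day % multiple == 0;
}

// Re-checks the three expected floor values from the quoted test (UTC+3:30 = 210 min).
inline void CheckReviewedValues() {
  assert(IsMultipleOf(LocalMinuteOfDay(19, 26, 210), 16));   // 22:56 local
  assert(IsMultipleOf(LocalMinuteOfDay(19, 58, 210), 16));   // 23:28 local
  assert(!IsMultipleOf(LocalMinuteOfDay(20, 28, 210), 16));  // 23:58 local
}
```

Under these assumptions the first two expected floor values are aligned to 16 minutes in local time and the third is not, which matches the observation in the comment.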





[GitHub] [arrow] ursabot commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
ursabot commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1126596142

   Benchmark runs are scheduled for baseline = 7a955f07b3472a36d9174eb71883f8f9c33083ae and contender = dbf11f8b4520d94afa2d46c8a3b5d01192716dbf. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Skipped :warning: Only ['lang', 'name'] filters are supported on ec2-t3-xlarge-us-east-2] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/53ace4fa7e464725adfa1de0ff7a72cd...8853feeec7ce4eee91ab66b0365b1e46/)
   [Skipped :warning: Only ['lang', 'name'] filters are supported on test-mac-arm] [test-mac-arm](https://conbench.ursa.dev/compare/runs/3deada4f13044a2d9f1a3680c80fcbec...b2b6c671a10b4656956dc9619d68a6ad/)
   [Skipped :warning: Only ['lang', 'name'] filters are supported on ursa-i9-9960x] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/29bfe49c406545ae93205d6c68d8ce1a...be7a6e3a6fce4354a86fe8f61939789f/)
   [Scheduled] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/1ee5f735665e456f8f2de04a87ed6117...a10f89b8538e4aa38ac66487b31415bd/)
   Buildkite builds:
   [Scheduled] [`dbf11f8b` ursa-thinkcentre-m75q](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/762)
   [Finished] [`7a955f07` ec2-t3-xlarge-us-east-2](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/747)
   [Failed] [`7a955f07` test-mac-arm](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/744)
   [Scheduled] [`7a955f07` ursa-i9-9960x](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/734)
   [Finished] [`7a955f07` ursa-thinkcentre-m75q](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/749)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   




[GitHub] [arrow] rok commented on a diff in pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on code in PR #12528:
URL: https://github.com/apache/arrow/pull/12528#discussion_r865042558


##########
cpp/src/arrow/compute/kernels/scalar_temporal_test.cc:
##########
@@ -2611,6 +2611,107 @@ TEST_F(ScalarTemporalTest, TestRoundTemporal) {
   CheckScalarUnary(op, unit, times, unit, round_15_years, &round_to_15_years);
 }
 
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalAmbiguous1) {
+  auto unit = timestamp(TimeUnit::MILLI, "Asia/Tehran");
+  auto options = RoundTemporalOptions(1, CalendarUnit::HOUR, /*some_parameter=*/true);
+  // Asia/Tehran switched from UTC+X to UTC+Y on 2022-03-31 HH:mm:ss

Review Comment:
   So far I have something like this:
   ```
     // Asia/Tehran switches from UTC+3:30 to UTC+4:30 on 2022-03-22 00:00:00 UTC+3:30
     // This causes an hour long non-existing period in local time.
   ```
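
To illustrate what "non-existing" and "ambiguous" mean in these comments, here is a minimal sketch that classifies a local wall-clock reading around a single offset transition. It assumes the transition instant and both offsets are supplied by the caller, all in minutes on a shared axis. `LocalTimeKind` and `Classify` are hypothetical names, not part of Arrow or the vendored date library.

```cpp
#include <cassert>
#include <cstdint>

enum class LocalTimeKind { Unique, Ambiguous, Nonexistent };

// Classify a local wall-clock reading near one offset transition.
// transition_utc: UTC instant of the switch; offsets: minutes east of UTC.
// All inputs are caller-supplied assumptions; no tz database is consulted.
inline LocalTimeKind Classify(int64_t local_minutes, int64_t transition_utc,
                              int64_t offset_before, int64_t offset_after) {
  // Local readings of the transition instant under the old and new offsets.
  const int64_t local_old = transition_utc + offset_before;
  const int64_t local_new = transition_utc + offset_after;
  if (offset_after > offset_before) {
    // Clock jumps forward: readings in [local_old, local_new) never occur.
    if (local_minutes >= local_old && local_minutes < local_new) {
      return LocalTimeKind::Nonexistent;
    }
  } else if (offset_after < offset_before) {
    // Clock jumps back: readings in [local_new, local_old) occur twice.
    if (local_minutes >= local_new && local_minutes < local_old) {
      return LocalTimeKind::Ambiguous;
    }
  }
  return LocalTimeKind::Unique;
}
```

With minute 0 placed at 2022-03-22 00:00 UTC, the switch described above happens at minute -210 with offsets 210 and 270, so local minutes 0 to 59 (00:00 to 00:59 local) are classified as nonexistent.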





[GitHub] [arrow] rok commented on a diff in pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on code in PR #12528:
URL: https://github.com/apache/arrow/pull/12528#discussion_r867155626


##########
cpp/src/arrow/compute/kernels/scalar_temporal_test.cc:
##########
@@ -2611,6 +2611,114 @@ TEST_F(ScalarTemporalTest, TestRoundTemporal) {
   CheckScalarUnary(op, unit, times, unit, round_15_years, &round_to_15_years);
 }
 
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalAmbiguous1) {
+  // Asia/Tehran switches from UTC+4:30 to UTC+3:30 on 2022-09-22 00:00:00 UTC+4:30.
+  // This causes an hour long ambiguous period in local time.
+  auto unit = timestamp(TimeUnit::MILLI, "Asia/Tehran");
+  auto options = RoundTemporalOptions(1, CalendarUnit::HOUR, /*some_parameter=*/true);

Review Comment:
   Ugh. Removed - for the record it was `week_starts_monday`. I love stating tautologies.





[GitHub] [arrow] rok commented on a diff in pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on code in PR #12528:
URL: https://github.com/apache/arrow/pull/12528#discussion_r867154702


##########
cpp/src/arrow/compute/kernels/scalar_temporal_test.cc:
##########
@@ -2611,6 +2611,114 @@ TEST_F(ScalarTemporalTest, TestRoundTemporal) {
   CheckScalarUnary(op, unit, times, unit, round_15_years, &round_to_15_years);
 }
 
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalAmbiguous1) {
+  // Asia/Tehran switches from UTC+4:30 to UTC+3:30 on 2022-09-22 00:00:00 UTC+4:30.
+  // This causes an hour long ambiguous period in local time.

Review Comment:
   Removed.





[GitHub] [arrow] jorisvandenbossche commented on a diff in pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on code in PR #12528:
URL: https://github.com/apache/arrow/pull/12528#discussion_r875939346


##########
cpp/src/arrow/compute/kernels/scalar_temporal_test.cc:
##########
@@ -2854,17 +2854,66 @@ TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalNonexistent2) {
       R"(["2015-03-29 00:52:00", "2015-03-29 01:12:00", "2015-03-29 01:12:00",
           "2015-03-29 01:12:00", "2015-03-29 01:12:00", "2015-03-29 01:12:00"])";
   const char* times_floor =
-      R"(["2015-03-29 00:52:00", "2015-03-29 00:52:00", "2015-03-29 00:52:00",
-          "2015-03-29 00:52:00", "2015-03-29 00:52:00", "2015-03-29 01:12:00"])";
+      R"(["2015-03-29 00:52:00", "2015-03-29 00:56:00", "2015-03-29 00:56:00",
+          "2015-03-29 00:56:00", "2015-03-29 00:56:00", "2015-03-29 01:12:00"])";
   const char* times_round =
-      R"(["2015-03-29 00:52:00", "2015-03-29 01:08:00", "2015-03-29 01:12:00",
+      R"(["2015-03-29 00:52:00", "2015-03-29 00:56:00", "2015-03-29 01:12:00",
           "2015-03-29 01:12:00", "2015-03-29 01:12:00", "2015-03-29 01:12:00"])";
 
   CheckScalarUnary("ceil_temporal", unit, times, unit, times_ceil, &options);
   CheckScalarUnary("floor_temporal", unit, times, unit, times_floor, &options);
   CheckScalarUnary("round_temporal", unit, times, unit, times_round, &options);
 }
 
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalDSTJump) {
+  // Europe/Brussels switches from UTC+2:00 to UTC+1:00 on 2015-10-29 03:00:00 UTC+2:00
+  // This causes an hour long ambiguous period in local time.
+  // Europe/Brussels switches from UTC+1:00 to UTC+2:00 on 2018-03-28 02:00:00 UTC+1:00
+  // This causes an hour long non-existing period in local time.

Review Comment:
   ```suggestion
     // Europe/Brussels switches from UTC+2:00 to UTC+1:00 on 2018-10-29 03:00:00 UTC+2:00
     // This causes an hour long ambiguous period in local time.
     // Europe/Brussels switches from UTC+1:00 to UTC+2:00 on 2015-03-28 02:00:00 UTC+1:00
     // This causes an hour long non-existing period in local time.
   ```
   
   ? (in any case it doesn't match with what is in the test below)





[GitHub] [arrow] rok commented on a diff in pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on code in PR #12528:
URL: https://github.com/apache/arrow/pull/12528#discussion_r851496389


##########
cpp/src/arrow/compute/kernels/temporal_internal.h:
##########
@@ -101,7 +101,10 @@ struct NonZonedLocalizer {
   }
 
   template <typename Duration>
-  Duration ConvertLocalToSys(Duration t, Status* st) const {
+  Duration ConvertLocalToSys(
+      Duration t, Status* st,
+      const AmbiguousTime ambiguous = AmbiguousTime::AMBIGUOUS_RAISE,
+      const NonexistentTime nonexistent_time = NonexistentTime::NONEXISTENT_RAISE) const {
     return t;
   }
 

Review Comment:
   ```suggestion
     template <typename Duration, typename Unit>
     Duration FloorTime(int64_t t, const RoundTemporalOptions* options) const {
       const Unit d = floor<Unit>(sys_time<Duration>(Duration{t})).time_since_epoch();
   
       if (options->multiple == 1) {
         return duration_cast<Duration>(d);
       } else {
         const Unit unit = Unit{options->multiple};
         const Unit m = (d.count() >= 0) ? d / unit * unit : (d - unit + Unit{1}) / unit * unit;
         return duration_cast<Duration>(m);
       }
     }
   
     template <typename Duration, typename Unit>
     Duration CeilTime(int64_t t, const RoundTemporalOptions* options) const {
       const Duration d = FloorTime<Duration, Unit>(t, options);
       if (d.count() < t) {
         return d + duration_cast<Duration>(Unit{options->multiple});
       }
       return d;
     }
   
     template <typename Duration, typename Unit>
     Duration RoundTime(const int64_t t, const RoundTemporalOptions* options) const {
       const Duration f = FloorTime<Duration, Unit>(t, options);
       Duration c = f;
       if (f.count() < t) {
         c += duration_cast<Duration>(Unit{options->multiple});
       }
       return (t - f.count() >= c.count() - t) ? c : f;
     }
   ```
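
The floor/ceil/round-to-multiple logic in the suggestion above can be tried in isolation with a standalone sketch. It uses `std::chrono` (C++17) in place of the vendored date library, takes the multiple as a plain positive integer instead of `RoundTemporalOptions`, and uses the hypothetical names `FloorToMultiple`, `CeilToMultiple` and `RoundToMultiple`. It is an illustration of the technique, not the code in the PR.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Floor `t` (ticks of Duration since the epoch) to a multiple of `multiple` Units.
// Assumes multiple > 0.
template <typename Duration, typename Unit>
Duration FloorToMultiple(int64_t t, int64_t multiple) {
  const int64_t n = std::chrono::floor<Unit>(Duration{t}).count();
  int64_t q = n / multiple;               // truncates toward zero
  if (n < 0 && n % multiple != 0) --q;    // adjust so negatives round down
  return std::chrono::duration_cast<Duration>(Unit{q * multiple});
}

// Ceil: take the floor, then step up one multiple unless `t` is already aligned.
template <typename Duration, typename Unit>
Duration CeilToMultiple(int64_t t, int64_t multiple) {
  const Duration f = FloorToMultiple<Duration, Unit>(t, multiple);
  if (f.count() < t) {
    return f + std::chrono::duration_cast<Duration>(Unit{multiple});
  }
  return f;
}

// Round: pick the nearer of floor and ceil; ties go to the ceiling.
template <typename Duration, typename Unit>
Duration RoundToMultiple(int64_t t, int64_t multiple) {
  const Duration f = FloorToMultiple<Duration, Unit>(t, multiple);
  const Duration c = CeilToMultiple<Duration, Unit>(t, multiple);
  return (t - f.count() >= c.count() - t) ? c : f;
}
```

For example, with `Duration = std::chrono::seconds`, `Unit = std::chrono::minutes` and a multiple of 25, an input of 3600 seconds floors to 3000 seconds (50 minutes) and ceils to 4500 seconds (75 minutes). The tie-breaking rule mirrors the `>=` comparison in the suggestion.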



##########
cpp/src/arrow/compute/kernels/temporal_internal.h:
##########
@@ -120,18 +123,59 @@ struct ZonedLocalizer {
   }
 
   template <typename Duration>
-  Duration ConvertLocalToSys(Duration t, Status* st) const {
+  Duration get_local_time(Duration arg) const {
+    return zoned_time<Duration>(tz, local_time<Duration>(arg))
+        .get_sys_time()
+        .time_since_epoch();
+  }
+
+  template <typename Duration>
+  Duration get_local_time(Duration arg, const arrow_vendored::date::choose choose) const {
+    return zoned_time<Duration>(tz, local_time<Duration>(arg), choose)
+        .get_sys_time()
+        .time_since_epoch();
+  }
+
+  template <typename Duration>
+  Duration ConvertLocalToSys(
+      Duration t, Status* st,
+      const AmbiguousTime ambiguous = AmbiguousTime::AMBIGUOUS_RAISE,
+      const NonexistentTime nonexistent_time = NonexistentTime::NONEXISTENT_RAISE) const {
     try {
       return zoned_time<Duration>{tz, local_time<Duration>(t)}
           .get_sys_time()
           .time_since_epoch();
     } catch (const arrow_vendored::date::nonexistent_local_time& e) {
-      *st = Status::Invalid("Local time does not exist: ", e.what());
-      return Duration{0};
+      switch (nonexistent_time) {
+        case NonexistentTime::NONEXISTENT_RAISE: {
+          *st = Status::Invalid("Timestamp doesn't exist in timezone '", tz,
+                                "': ", e.what());
+          return t;
+        }
+        case NonexistentTime::NONEXISTENT_EARLIEST: {
+          return get_local_time<Duration>(t, arrow_vendored::date::choose::latest) -
+                 Duration{1};
+        }
+        case NonexistentTime::NONEXISTENT_LATEST: {
+          return get_local_time<Duration>(t, arrow_vendored::date::choose::latest);
+        }
+      }
     } catch (const arrow_vendored::date::ambiguous_local_time& e) {
-      *st = Status::Invalid("Local time is ambiguous: ", e.what());
-      return Duration{0};
+      switch (ambiguous) {
+        case AmbiguousTime::AMBIGUOUS_RAISE: {
+          *st = Status::Invalid("Timestamp is ambiguous in timezone '", tz,
+                                "': ", e.what());
+          return t;
+        }
+        case AmbiguousTime::AMBIGUOUS_EARLIEST: {
+          return get_local_time<Duration>(t, arrow_vendored::date::choose::earliest);
+        }
+        case AmbiguousTime::AMBIGUOUS_LATEST: {
+          return get_local_time<Duration>(t, arrow_vendored::date::choose::latest);
+        }
+      }
     }
+    return Duration{0};
   }
 

Review Comment:
   ```suggestion
     template <typename Duration, typename Unit>
     Duration FloorTime(const int64_t t, const RoundTemporalOptions* options) const {
       const sys_time<Duration> st = sys_time<Duration>{Duration{t}};
       const std::chrono::seconds offset = tz->get_info(tz->to_local(st)).first.offset;
       const Unit d = floor<Unit>(st - offset).time_since_epoch();
   
       if (options->multiple == 1) {
         return duration_cast<Duration>(d + offset);
       } else {
         const Unit unit = Unit{options->multiple};
         const Unit m = (d.count() >= 0) ? d / unit * unit : (d - unit + Unit{1}) / unit * unit;
         return duration_cast<Duration>(m + offset);
       }
     }
   
     template <typename Duration, typename Unit>
     Duration CeilTime(const int64_t t, const RoundTemporalOptions* options) const {
       const Duration d = FloorTime<Duration, Unit>(t, options);
       if (d.count() < t) {
         return d + duration_cast<Duration>(Unit{options->multiple});
       }
       return d;
     }
   
     template <typename Duration, typename Unit>
     Duration RoundTime(const int64_t t, const RoundTemporalOptions* options) const {
       const Duration f = FloorTime<Duration, Unit>(t, options);
       Duration c = f;
       if (f.count() < t) {
         c += duration_cast<Duration>(Unit{options->multiple});
       }
       return (t - f.count() >= c.count() - t) ? c : f;
     }
   ```
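
   For readers following the suggestion: the `d / unit * unit` expression relies on integer division truncating toward zero, which is why negative counts take a separate branch. A minimal standalone sketch of that idiom, assuming plain `std::chrono` rather than the vendored date library; `FloorToMultiple` is a hypothetical name used for illustration and is not part of Arrow:

```cpp
#include <chrono>

// Hypothetical helper (not an Arrow function): floor `d` to a multiple of
// `unit`, mirroring the expression used in the suggestion.
template <typename Unit>
constexpr Unit FloorToMultiple(Unit d, Unit unit) {
  // duration / duration yields an integer count that truncates toward zero,
  // so negative inputs are shifted down first to round toward -infinity.
  return (d.count() >= 0) ? d / unit * unit
                          : (d - unit + Unit{1}) / unit * unit;
}
```

   With a 15-minute unit, 37 minutes floors to 30 and -1 minute floors to -15 rather than 0.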





[GitHub] [arrow] rok commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1113207341

   @pitrou I think this is ready for another review.




[GitHub] [arrow] rok commented on a diff in pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on code in PR #12528:
URL: https://github.com/apache/arrow/pull/12528#discussion_r875966801


##########
cpp/src/arrow/compute/kernels/scalar_temporal_test.cc:
##########
@@ -2854,17 +2854,66 @@ TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalNonexistent2) {
       R"(["2015-03-29 00:52:00", "2015-03-29 01:12:00", "2015-03-29 01:12:00",
           "2015-03-29 01:12:00", "2015-03-29 01:12:00", "2015-03-29 01:12:00"])";
   const char* times_floor =
-      R"(["2015-03-29 00:52:00", "2015-03-29 00:52:00", "2015-03-29 00:52:00",
-          "2015-03-29 00:52:00", "2015-03-29 00:52:00", "2015-03-29 01:12:00"])";
+      R"(["2015-03-29 00:52:00", "2015-03-29 00:56:00", "2015-03-29 00:56:00",
+          "2015-03-29 00:56:00", "2015-03-29 00:56:00", "2015-03-29 01:12:00"])";
   const char* times_round =
-      R"(["2015-03-29 00:52:00", "2015-03-29 01:08:00", "2015-03-29 01:12:00",
+      R"(["2015-03-29 00:52:00", "2015-03-29 00:56:00", "2015-03-29 01:12:00",
           "2015-03-29 01:12:00", "2015-03-29 01:12:00", "2015-03-29 01:12:00"])";
 
   CheckScalarUnary("ceil_temporal", unit, times, unit, times_ceil, &options);
   CheckScalarUnary("floor_temporal", unit, times, unit, times_floor, &options);
   CheckScalarUnary("round_temporal", unit, times, unit, times_round, &options);
 }
 
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalDSTJump) {
+  // Europe/Brussels switches from UTC+2:00 to UTC+1:00 on 2015-10-29 03:00:00 UTC+2:00
+  // This causes an hour long ambiguous period in local time.
+  // Europe/Brussels switches from UTC+1:00 to UTC+2:00 on 2018-03-28 02:00:00 UTC+1:00
+  // This causes an hour long non-existing period in local time.
+  auto unit = timestamp(TimeUnit::SECOND, "Europe/Brussels");
+  auto options = RoundTemporalOptions(256, CalendarUnit::MINUTE);
+  const char* times =
+      R"(["2015-03-28 21:31:00", "2015-03-28 23:32:00", "2015-03-28 23:33:00",
+          "2015-03-28 23:53:00", "2015-03-29 01:08:00", "2015-03-29 01:28:00",
+          "2015-03-29 01:32:00", "2015-03-29 01:51:00", "2015-03-29 02:12:00",
+          "2015-03-29 02:44:00", "2015-03-29 02:59:00", "2015-03-29 03:02:00",
+          "2015-03-29 03:08:00", "2015-03-29 03:26:00", "2015-03-29 04:59:00",
+          "2018-10-27 20:44:00", "2018-10-27 21:45:00", "2018-10-27 22:46:00",
+          "2018-10-27 23:09:00", "2018-10-27 23:10:00", "2018-10-27 23:11:00",
+          "2018-10-28 03:14:00", "2018-10-28 04:15:00", "2018-10-28 05:16:00"])";

Review Comment:
   Yeah agreed. I'll look into reducing these.





[GitHub] [arrow] pitrou commented on a diff in pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
pitrou commented on code in PR #12528:
URL: https://github.com/apache/arrow/pull/12528#discussion_r873713195


##########
cpp/src/arrow/compute/kernels/scalar_temporal_test.cc:
##########
@@ -2611,6 +2611,309 @@ TEST_F(ScalarTemporalTest, TestRoundTemporal) {
   CheckScalarUnary(op, unit, times, unit, round_15_years, &round_to_15_years);
 }
 
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalAmbiguous1) {
+  // Asia/Tehran switches from UTC+4:30 to UTC+3:30 on 2022-09-22 00:00:00 UTC+4:30.
+  // This causes an hour long ambiguous period in local time.
+  auto unit = timestamp(TimeUnit::MILLI, "Asia/Tehran");
+  auto options = RoundTemporalOptions(25, CalendarUnit::MINUTE);
+  const char* times = R"([
+    "2022-09-21 18:09:00", "2022-09-21 18:10:00", "2022-09-21 18:11:00",
+    "2022-09-21 18:19:00", "2022-09-21 18:20:00", "2022-09-21 18:21:00",
+    "2022-09-21 18:44:00", "2022-09-21 18:45:00", "2022-09-21 18:46:00",
+    "2022-09-21 19:09:00", "2022-09-21 19:10:00", "2022-09-21 19:11:00",
+    "2022-09-21 19:24:00", "2022-09-21 19:25:00", "2022-09-21 19:26:00",
+    "2022-09-21 19:34:00", "2022-09-21 19:35:00", "2022-09-21 19:36:00",
+    "2022-09-21 19:59:00", "2022-09-21 20:00:00", "2022-09-21 20:01:00",
+    "2022-09-21 20:24:00", "2022-09-21 20:25:00", "2022-09-21 20:26:00",
+    "2022-09-21 20:49:00", "2022-09-21 20:50:00", "2022-09-21 20:51:00"])";
+  const char* times_ceil = R"([
+    "2022-09-21 18:10:00", "2022-09-21 18:10:00", "2022-09-21 18:20:00",
+    "2022-09-21 18:20:00", "2022-09-21 18:20:00", "2022-09-21 18:45:00",
+    "2022-09-21 18:45:00", "2022-09-21 18:45:00", "2022-09-21 19:10:00",
+    "2022-09-21 19:10:00", "2022-09-21 19:10:00", "2022-09-21 19:35:00",
+    "2022-09-21 19:35:00", "2022-09-21 19:35:00", "2022-09-21 19:35:00",
+    "2022-09-21 19:35:00", "2022-09-21 19:35:00", "2022-09-21 20:00:00",
+    "2022-09-21 20:00:00", "2022-09-21 20:00:00", "2022-09-21 20:25:00",
+    "2022-09-21 20:25:00", "2022-09-21 20:25:00", "2022-09-21 20:50:00",
+    "2022-09-21 20:50:00", "2022-09-21 20:50:00", "2022-09-21 21:15:00"])";
+  const char* times_floor = R"([
+    "2022-09-21 17:45:00", "2022-09-21 18:10:00", "2022-09-21 18:10:00",
+    "2022-09-21 18:10:00", "2022-09-21 18:10:00", "2022-09-21 18:10:00",
+    "2022-09-21 18:20:00", "2022-09-21 18:45:00", "2022-09-21 18:45:00",
+    "2022-09-21 18:45:00", "2022-09-21 19:10:00", "2022-09-21 19:10:00",
+    "2022-09-21 19:10:00", "2022-09-21 19:10:00", "2022-09-21 19:10:00",
+    "2022-09-21 19:10:00", "2022-09-21 19:35:00", "2022-09-21 19:35:00",
+    "2022-09-21 19:35:00", "2022-09-21 20:00:00", "2022-09-21 20:00:00",
+    "2022-09-21 20:00:00", "2022-09-21 20:25:00", "2022-09-21 20:25:00",
+    "2022-09-21 20:25:00", "2022-09-21 20:50:00", "2022-09-21 20:50:00"])";
+  const char* times_round = R"([
+    "2022-09-21 18:10:00", "2022-09-21 18:10:00", "2022-09-21 18:10:00",
+    "2022-09-21 18:20:00", "2022-09-21 18:20:00", "2022-09-21 18:10:00",
+    "2022-09-21 18:45:00", "2022-09-21 18:45:00", "2022-09-21 18:45:00",
+    "2022-09-21 19:10:00", "2022-09-21 19:10:00", "2022-09-21 19:10:00",
+    "2022-09-21 19:35:00", "2022-09-21 19:35:00", "2022-09-21 19:35:00",
+    "2022-09-21 19:35:00", "2022-09-21 19:35:00", "2022-09-21 19:35:00",
+    "2022-09-21 20:00:00", "2022-09-21 20:00:00", "2022-09-21 20:00:00",
+    "2022-09-21 20:25:00", "2022-09-21 20:25:00", "2022-09-21 20:25:00",
+    "2022-09-21 20:50:00", "2022-09-21 20:50:00", "2022-09-21 20:50:00"])";
+
+  CheckScalarUnary("ceil_temporal", unit, times, unit, times_ceil, &options);
+  CheckScalarUnary("floor_temporal", unit, times, unit, times_floor, &options);
+  CheckScalarUnary("round_temporal", unit, times, unit, times_round, &options);
+}
+
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalAmbiguous2) {
+  // Europe/Brussels switches from UTC+2:00 to UTC+1:00 on 2018-10-28 03:00:00 UTC+2:00
+  // This causes an hour long ambiguous period in local time.
+  auto unit = timestamp(TimeUnit::NANO, "Europe/Brussels");
+  auto options = RoundTemporalOptions(25, CalendarUnit::MINUTE);
+  const char* complete_times = R"([
+    "2018-10-27 23:05:00", "2018-10-27 23:06:00", "2018-10-27 23:07:00", "2018-10-27 23:08:00",
+    "2018-10-27 23:09:00", "2018-10-27 23:10:00", "2018-10-27 23:11:00", "2018-10-27 23:12:00",
+    "2018-10-27 23:13:00", "2018-10-27 23:14:00", "2018-10-27 23:15:00", "2018-10-27 23:16:00",
+    "2018-10-27 23:17:00", "2018-10-27 23:18:00", "2018-10-27 23:19:00", "2018-10-27 23:20:00",
+    "2018-10-27 23:21:00", "2018-10-27 23:22:00", "2018-10-27 23:23:00", "2018-10-27 23:24:00",
+    "2018-10-27 23:25:00", "2018-10-27 23:26:00", "2018-10-27 23:27:00", "2018-10-27 23:28:00",
+    "2018-10-27 23:29:00", "2018-10-27 23:30:00", "2018-10-27 23:31:00", "2018-10-27 23:32:00",
+    "2018-10-27 23:33:00", "2018-10-27 23:34:00", "2018-10-27 23:35:00", "2018-10-27 23:36:00",
+    "2018-10-27 23:37:00", "2018-10-27 23:38:00", "2018-10-27 23:39:00", "2018-10-27 23:40:00",
+    "2018-10-27 23:41:00", "2018-10-27 23:42:00", "2018-10-27 23:43:00", "2018-10-27 23:44:00",
+    "2018-10-27 23:45:00", "2018-10-27 23:46:00", "2018-10-27 23:47:00", "2018-10-27 23:48:00",
+    "2018-10-27 23:49:00", "2018-10-27 23:50:00", "2018-10-27 23:51:00", "2018-10-27 23:52:00",
+    "2018-10-27 23:53:00", "2018-10-27 23:54:00", "2018-10-27 23:55:00", "2018-10-27 23:56:00",
+    "2018-10-27 23:57:00", "2018-10-27 23:58:00", "2018-10-27 23:59:00", "2018-10-28 00:00:00",
+    "2018-10-28 00:01:00", "2018-10-28 00:02:00", "2018-10-28 00:03:00", "2018-10-28 00:04:00",
+    "2018-10-28 00:05:00", "2018-10-28 00:06:00", "2018-10-28 00:07:00", "2018-10-28 00:08:00",
+    "2018-10-28 00:09:00", "2018-10-28 00:10:00", "2018-10-28 00:11:00", "2018-10-28 00:12:00",
+    "2018-10-28 00:13:00", "2018-10-28 00:14:00", "2018-10-28 00:15:00", "2018-10-28 00:16:00",
+    "2018-10-28 00:17:00", "2018-10-28 00:18:00", "2018-10-28 00:19:00", "2018-10-28 00:20:00",
+    "2018-10-28 00:21:00", "2018-10-28 00:22:00", "2018-10-28 00:23:00", "2018-10-28 00:24:00",
+    "2018-10-28 00:25:00", "2018-10-28 00:26:00", "2018-10-28 00:27:00", "2018-10-28 00:28:00",
+    "2018-10-28 00:29:00", "2018-10-28 00:30:00", "2018-10-28 00:31:00", "2018-10-28 00:32:00",
+    "2018-10-28 00:33:00", "2018-10-28 00:34:00", "2018-10-28 00:35:00", "2018-10-28 00:36:00",
+    "2018-10-28 00:37:00", "2018-10-28 00:38:00", "2018-10-28 00:39:00", "2018-10-28 00:40:00",
+    "2018-10-28 00:41:00", "2018-10-28 00:42:00", "2018-10-28 00:43:00", "2018-10-28 00:44:00",
+    "2018-10-28 00:45:00", "2018-10-28 00:46:00", "2018-10-28 00:47:00", "2018-10-28 00:48:00",
+    "2018-10-28 00:49:00", "2018-10-28 00:50:00", "2018-10-28 00:51:00", "2018-10-28 00:52:00",
+    "2018-10-28 00:53:00", "2018-10-28 00:54:00", "2018-10-28 00:55:00", "2018-10-28 00:56:00",
+    "2018-10-28 00:57:00", "2018-10-28 00:58:00", "2018-10-28 00:59:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:01:00", "2018-10-28 01:02:00", "2018-10-28 01:03:00", "2018-10-28 01:04:00",
+    "2018-10-28 01:05:00", "2018-10-28 01:06:00", "2018-10-28 01:07:00", "2018-10-28 01:08:00",
+    "2018-10-28 01:09:00", "2018-10-28 01:10:00", "2018-10-28 01:11:00", "2018-10-28 01:12:00",
+    "2018-10-28 01:13:00", "2018-10-28 01:14:00", "2018-10-28 01:15:00", "2018-10-28 01:16:00",
+    "2018-10-28 01:17:00", "2018-10-28 01:18:00", "2018-10-28 01:19:00", "2018-10-28 01:20:00",
+    "2018-10-28 01:21:00", "2018-10-28 01:22:00", "2018-10-28 01:23:00", "2018-10-28 01:24:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:26:00", "2018-10-28 01:27:00", "2018-10-28 01:28:00",
+    "2018-10-28 01:29:00", "2018-10-28 01:30:00", "2018-10-28 01:31:00", "2018-10-28 01:32:00",
+    "2018-10-28 01:33:00", "2018-10-28 01:34:00", "2018-10-28 01:35:00", "2018-10-28 01:36:00",
+    "2018-10-28 01:37:00", "2018-10-28 01:38:00", "2018-10-28 01:39:00", "2018-10-28 01:40:00",
+    "2018-10-28 01:41:00", "2018-10-28 01:42:00", "2018-10-28 01:43:00", "2018-10-28 01:44:00",
+    "2018-10-28 01:45:00", "2018-10-28 01:46:00", "2018-10-28 01:47:00", "2018-10-28 01:48:00",
+    "2018-10-28 01:49:00", "2018-10-28 01:50:00", "2018-10-28 01:51:00", "2018-10-28 01:52:00",
+    "2018-10-28 01:53:00", "2018-10-28 01:54:00", "2018-10-28 01:55:00", "2018-10-28 01:56:00",
+    "2018-10-28 01:57:00", "2018-10-28 01:58:00", "2018-10-28 01:59:00", "2018-10-28 02:00:00",
+    "2018-10-28 02:01:00", "2018-10-28 02:02:00", "2018-10-28 02:03:00", "2018-10-28 02:04:00",
+    "2018-10-28 02:05:00", "2018-10-28 02:06:00", "2018-10-28 02:07:00", "2018-10-28 02:08:00",
+    "2018-10-28 02:09:00", "2018-10-28 02:10:00", "2018-10-28 02:11:00", "2018-10-28 02:12:00",
+    "2018-10-28 02:13:00", "2018-10-28 02:14:00", "2018-10-28 02:15:00", "2018-10-28 02:16:00"])";
+
+  const char* complete_times_floor = R"([
+    "2018-10-27 22:45:00", "2018-10-27 22:45:00", "2018-10-27 22:45:00", "2018-10-27 22:45:00",
+    "2018-10-27 22:45:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:45:00",
+    "2018-10-27 23:45:00", "2018-10-27 23:45:00", "2018-10-27 23:45:00", "2018-10-27 23:45:00",
+    "2018-10-27 23:45:00", "2018-10-27 23:45:00", "2018-10-27 23:45:00", "2018-10-27 23:45:00",
+    "2018-10-27 23:45:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00",
+    "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00",
+    "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00",
+    "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00",
+    "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00",
+    "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00",
+    "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00",
+    "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00",
+    "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00",
+    "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00",
+    "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00",
+    "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00",
+    "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 02:15:00", "2018-10-28 02:15:00"])";
+
+  const char* times = R"([
+    "2018-10-27 22:44:00", "2018-10-27 22:45:00", "2018-10-27 22:46:00",
+    "2018-10-27 23:09:00", "2018-10-27 23:10:00", "2018-10-27 23:11:00",
+    "2018-10-27 23:34:00", "2018-10-27 23:35:00", "2018-10-27 23:36:00",
+    "2018-10-27 23:44:00", "2018-10-27 23:45:00", "2018-10-27 23:46:00",
+    "2018-10-27 23:46:00", "2018-10-28 00:00:00", "2018-10-28 00:09:00",
+    "2018-10-28 00:09:00", "2018-10-28 00:10:00", "2018-10-28 00:11:00",
+    "2018-10-28 00:34:00", "2018-10-28 00:35:00", "2018-10-28 00:36:00",
+    "2018-10-28 00:59:00", "2018-10-28 01:00:00", "2018-10-28 01:01:00",
+    "2018-10-28 01:24:00", "2018-10-28 01:25:00", "2018-10-28 01:26:00",
+    "2018-10-28 01:49:00", "2018-10-28 01:50:00", "2018-10-28 01:51:00",
+    "2018-10-28 02:14:00", "2018-10-28 02:15:00", "2018-10-28 02:16:00"])";
+  const char* times_ceil = R"([
+    "2018-10-27 22:45:00", "2018-10-27 22:45:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:45:00",
+    "2018-10-27 23:45:00", "2018-10-27 23:45:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:35:00",
+    "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:50:00",
+    "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 02:15:00",
+    "2018-10-28 02:15:00", "2018-10-28 02:15:00", "2018-10-28 02:40:00"])";
+  const char* times_floor = R"([
+    "2018-10-27 22:20:00", "2018-10-27 22:45:00", "2018-10-27 22:45:00",
+    "2018-10-27 22:45:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:45:00", "2018-10-27 23:45:00",
+    "2018-10-27 23:45:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00",
+    "2018-10-28 00:35:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00",
+    "2018-10-28 01:50:00", "2018-10-28 02:15:00", "2018-10-28 02:15:00"])";
+  const char* times_round = R"([
+    "2018-10-27 22:45:00", "2018-10-27 22:45:00", "2018-10-27 22:45:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:45:00", "2018-10-27 23:45:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00",
+    "2018-10-28 02:15:00", "2018-10-28 02:15:00", "2018-10-28 02:15:00"])";
+
+  CheckScalarUnary("floor_temporal", unit, complete_times, unit, complete_times_floor,
+                   &options);
+  CheckScalarUnary("ceil_temporal", unit, times, unit, times_ceil, &options);
+  CheckScalarUnary("floor_temporal", unit, times, unit, times_floor, &options);
+  CheckScalarUnary("round_temporal", unit, times, unit, times_round, &options);
+}
+
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalNonexistent1) {
+  // Asia/Tehran switches from UTC+3:30 to UTC+4:30 on 2022-03-22 00:00:00 UTC+3:30
+  // This causes an hour long non-existing period in local time.
+  auto unit = timestamp(TimeUnit::SECOND, "Asia/Tehran");
+  auto options = RoundTemporalOptions(16, CalendarUnit::MINUTE);
+  const char* times = R"([
+    "2022-03-21 19:30:00", "2022-03-21 20:00:00", "2022-03-21 20:30:00"])";
+  const char* times_ceil = R"([
+    "2022-03-21 19:42:00", "2022-03-21 20:14:00", "2022-03-21 20:34:00"])";
+  const char* times_floor = R"([
+    "2022-03-21 19:26:00", "2022-03-21 19:58:00", "2022-03-21 20:18:00"])";
+  const char* times_round = R"([
+    "2022-03-21 19:26:00", "2022-03-21 19:58:00", "2022-03-21 20:34:00"])";
+
+  CheckScalarUnary("ceil_temporal", unit, times, unit, times_ceil, &options);
+  CheckScalarUnary("floor_temporal", unit, times, unit, times_floor, &options);
+  CheckScalarUnary("round_temporal", unit, times, unit, times_round, &options);
+}
+
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalNonexistent2) {
+  // Europe/Brussels switches from UTC+1:00 to UTC+2:00 on 2015-03-29 02:00:00 UTC+1:00
+  // This causes an hour long non-existing period in local time.
+  auto unit = timestamp(TimeUnit::SECOND, "Europe/Brussels");
+  auto options = RoundTemporalOptions(16, CalendarUnit::MINUTE);
+  const char* times =
+      R"(["2015-03-29 00:52:00", "2015-03-29 01:01:00", "2015-03-29 01:05:00",
+          "2015-03-29 01:08:00", "2015-03-29 01:10:00", "2015-03-29 01:12:00"])";
+  const char* times_ceil =
+      R"(["2015-03-29 00:52:00", "2015-03-29 01:12:00", "2015-03-29 01:12:00",
+          "2015-03-29 01:12:00", "2015-03-29 01:12:00", "2015-03-29 01:12:00"])";
+  const char* times_floor =
+      R"(["2015-03-29 00:52:00", "2015-03-29 00:56:00", "2015-03-29 00:56:00",
+          "2015-03-29 00:56:00", "2015-03-29 00:56:00", "2015-03-29 01:12:00"])";

Review Comment:
   Another similar problem here: `2015-03-29 00:56:00` (UTC) is `2015-03-29 01:56:00` in local time (UTC+1), while `2015-03-29 01:12:00` (UTC) is `2015-03-29 03:12:00` in local time (UTC+2). So in local time they are separated by 76 minutes, which is not a multiple of 16.
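
   The arithmetic in this comment can be checked with a small sketch. It assumes the offsets stated in the comment (UTC+1 before the transition, UTC+2 after) and hardcodes them rather than querying a time zone database; `LocalMinutes` and the constant names are hypothetical:

```cpp
#include <chrono>

using std::chrono::hours;
using std::chrono::minutes;

// Wall-clock minutes since local midnight, given a UTC time of day and the
// UTC offset in effect at that instant (offsets are assumed, not looked up).
constexpr minutes LocalMinutes(hours utc_h, minutes utc_m, hours offset) {
  return utc_h + utc_m + offset;
}

// 00:56 UTC falls before the transition (UTC+1): local 01:56.
constexpr minutes kFloorLocal = LocalMinutes(hours{0}, minutes{56}, hours{1});
// 01:12 UTC falls after the transition (UTC+2): local 03:12.
constexpr minutes kNextLocal = LocalMinutes(hours{1}, minutes{12}, hours{2});

// The gap is 16 minutes in UTC but 76 minutes on the local wall clock.
constexpr minutes kUtcGap = (hours{1} + minutes{12}) - minutes{56};
constexpr minutes kLocalGap = kNextLocal - kFloorLocal;
```

   The UTC gap is a multiple of 16 while the local gap is not, which is the inconsistency being pointed out.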





[GitHub] [arrow] pitrou commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
pitrou commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1127664205

   Ok, it seems we're really shooting in the dark trying to get this right while deciphering the hardcoded C++ test values.
   
   I would suggest writing some conformance tests on the Python side (Pandas will definitely help) that would check that the result values satisfy the invariants for the three (floor, ceil, round) functions. Then we can perhaps reliably tune the computation to get the tests to pass.
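
   The comment does not spell out the invariants, and it proposes Python rather than C++. As a rough illustration only, here is one possible set of checks written in C++ on plain integer tick counts. The function names are hypothetical, and the checks assume no UTC offset change between the input and the result, which is exactly the case this PR still has to settle:

```cpp
#include <cstdint>

// Assumed invariant: floor is at or below t and less than one multiple away.
constexpr bool FloorOk(std::int64_t t, std::int64_t f, std::int64_t multiple) {
  return f <= t && t - f < multiple;
}

// Assumed invariant: ceil is at or above t and less than one multiple away.
constexpr bool CeilOk(std::int64_t t, std::int64_t c, std::int64_t multiple) {
  return c >= t && c - t < multiple;
}

// Assumed invariant: round returns the floor or the ceiling, whichever is
// nearer to t. Ties are accepted either way, so no tie policy is implied.
constexpr bool RoundOk(std::int64_t t, std::int64_t r, std::int64_t f,
                       std::int64_t c) {
  return (r == f || r == c) &&
         ((r >= t ? r - t : t - r) == (t - f < c - t ? t - f : c - t));
}
```

   A conformance test would evaluate such predicates over a dense range of inputs around each transition instead of comparing against hardcoded expected arrays.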




[GitHub] [arrow] rok commented on a diff in pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on code in PR #12528:
URL: https://github.com/apache/arrow/pull/12528#discussion_r867156658


##########
cpp/src/arrow/compute/kernels/scalar_temporal_test.cc:
##########
@@ -2611,6 +2611,114 @@ TEST_F(ScalarTemporalTest, TestRoundTemporal) {
   CheckScalarUnary(op, unit, times, unit, round_15_years, &round_to_15_years);
 }
 
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalAmbiguous1) {
+  // Asia/Tehran switches from UTC+4:30 to UTC+3:30 on 2022-09-22 00:00:00 UTC+4:30.
+  // This causes an hour long ambiguous period in local time.
+  auto unit = timestamp(TimeUnit::MILLI, "Asia/Tehran");
+  auto options = RoundTemporalOptions(1, CalendarUnit::HOUR, /*some_parameter=*/true);
+  const char* times = R"([
+    "2022-03-21 19:30:00", "2022-03-21 20:00:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 19:00:00", "2022-09-21 19:30:00",
+    "2022-09-21 20:00:00", "2022-09-21 20:30:00", "2022-09-21 21:00:00",
+    "2022-09-21 21:30:00"])";
+  const char* times_ceil = R"([
+    "2022-03-21 19:30:00", "2022-03-21 20:30:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 19:30:00", "2022-09-21 19:30:00",
+    "2022-09-21 20:30:00", "2022-09-21 20:30:00", "2022-09-21 21:30:00",
+    "2022-09-21 21:30:00"])";
+  const char* times_floor = R"([
+    "2022-03-21 19:30:00", "2022-03-21 19:30:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 18:30:00", "2022-09-21 19:30:00",
+    "2022-09-21 19:30:00", "2022-09-21 20:30:00", "2022-09-21 20:30:00",
+    "2022-09-21 21:30:00"])";
+  const char* times_round = R"([
+    "2022-03-21 19:30:00", "2022-03-21 20:30:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 19:30:00", "2022-09-21 19:30:00",
+    "2022-09-21 20:30:00", "2022-09-21 20:30:00", "2022-09-21 21:30:00",
+    "2022-09-21 21:30:00"])";
+
+  CheckScalarUnary("ceil_temporal", unit, times, unit, times_ceil, &options);
+  CheckScalarUnary("floor_temporal", unit, times, unit, times_floor, &options);
+  CheckScalarUnary("round_temporal", unit, times, unit, times_round, &options);
+}
+
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalAmbiguous2) {
+  // Europe/Brussels switches from UTC+2:00 to UTC+1:00 on 2022-10-25 03:00:00 UTC+2:00
+  // This causes an hour long ambiguous period in local time.
+  auto unit = timestamp(TimeUnit::NANO, "Europe/Brussels");
+  auto options = RoundTemporalOptions(15, CalendarUnit::MINUTE, true);
+  const char* times = R"([
+    "2018-10-28 01:05:00", "2018-10-28 01:20:00", "2018-10-28 01:55:00",
+    "2018-10-28 01:59:59", "2018-10-28 02:00:00", "2018-10-28 02:08:00"])";
+  const char* times_ceil = R"([
+    "2018-10-28 01:15:00", "2018-10-28 01:30:00", "2018-10-28 02:00:00",
+    "2018-10-28 02:00:00", "2018-10-28 02:00:00", "2018-10-28 02:15:00"])";
+  const char* times_floor = R"([
+    "2018-10-28 01:00:00", "2018-10-28 01:15:00", "2018-10-28 01:45:00",
+    "2018-10-28 01:45:00", "2018-10-28 02:00:00", "2018-10-28 02:00:00"])";
+  const char* times_round = R"([
+    "2018-10-28 01:00:00", "2018-10-28 01:15:00", "2018-10-28 02:00:00",
+    "2018-10-28 02:00:00", "2018-10-28 02:00:00", "2018-10-28 02:15:00"])";
+
+  CheckScalarUnary("ceil_temporal", unit, times, unit, times_ceil, &options);
+  CheckScalarUnary("floor_temporal", unit, times, unit, times_floor, &options);
+  CheckScalarUnary("round_temporal", unit, times, unit, times_round, &options);
+}
+
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalNonexistent1) {
+  // Asia/Tehran switches from UTC+3:30 to UTC+4:30 on 2022-03-22 00:00:00 UTC+3:30
+  // This causes an hour long non-existing period in local time.
+  auto unit = timestamp(TimeUnit::SECOND, "Asia/Tehran");
+  auto options = RoundTemporalOptions(1, CalendarUnit::HOUR, true);
+  const char* times = R"([
+    "2022-03-21 19:30:00", "2022-03-21 20:00:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 19:00:00", "2022-09-21 19:30:00",
+    "2022-09-21 20:00:00", "2022-09-21 20:30:00", "2022-09-21 21:00:00",
+    "2022-09-21 21:30:00"])";
+  const char* times_ceil = R"([
+    "2022-03-21 19:30:00", "2022-03-21 20:30:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 19:30:00", "2022-09-21 19:30:00",
+    "2022-09-21 20:30:00", "2022-09-21 20:30:00", "2022-09-21 21:30:00",
+    "2022-09-21 21:30:00"])";
+  const char* times_floor = R"([
+    "2022-03-21 19:30:00", "2022-03-21 19:30:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 18:30:00", "2022-09-21 19:30:00",
+    "2022-09-21 19:30:00", "2022-09-21 20:30:00", "2022-09-21 20:30:00",
+    "2022-09-21 21:30:00"])";
+  const char* times_round = R"([
+    "2022-03-21 19:30:00", "2022-03-21 20:30:00", "2022-03-21 20:30:00",

Review Comment:
   Done.





[GitHub] [arrow] rok commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1126137372

   @pitrou [Added a test case.](https://github.com/apache/arrow/pull/12528/commits/dbf11f8b4520d94afa2d46c8a3b5d01192716dbf)




[GitHub] [arrow] rok commented on a diff in pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on code in PR #12528:
URL: https://github.com/apache/arrow/pull/12528#discussion_r865137124


##########
cpp/src/arrow/compute/kernels/temporal_internal.h:
##########
@@ -119,19 +152,93 @@ struct ZonedLocalizer {
     return tz->to_local(sys_time<Duration>(Duration{t}));
   }
 
-  template <typename Duration>
-  Duration ConvertLocalToSys(Duration t, Status* st) const {
+  template <typename Duration, typename Unit>
+  Duration FloorTimePoint(const int64_t t, const RoundTemporalOptions* options) const {
+    local_time<Duration> lt = tz->to_local(sys_time<Duration>(Duration{t}));
+    const Unit d = floor<Unit>(lt).time_since_epoch();
+    Unit d2;
+
+    if (options->multiple == 1) {
+      d2 = d;
+    } else {
+      const Unit unit = Unit{options->multiple};
+      d2 = (d.count() >= 0) ? d / unit * unit : (d - unit + Unit{1}) / unit * unit;
+    }
+
     try {
-      return zoned_time<Duration>{tz, local_time<Duration>(t)}
+      return zoned_time<Duration>(tz, local_time<Duration>(duration_cast<Duration>(d2)))
           .get_sys_time()
           .time_since_epoch();
-    } catch (const arrow_vendored::date::nonexistent_local_time& e) {
-      *st = Status::Invalid("Local time does not exist: ", e.what());
-      return Duration{0};
-    } catch (const arrow_vendored::date::ambiguous_local_time& e) {
-      *st = Status::Invalid("Local time is ambiguous: ", e.what());
-      return Duration{0};
+    } catch (const arrow_vendored::date::ambiguous_local_time&) {
+      // In case we hit an ambiguous period we round to a time multiple just prior,
+      // convert to UTC and add the time unit we're rounding to.
+      const arrow_vendored::date::local_info li =
+          tz->get_info(local_time<Duration>(duration_cast<Duration>(d2)));
+
+      const Duration t2 =
+          zoned_time<Duration>(
+              tz, local_time<Duration>(duration_cast<Duration>(d2 - li.second.offset)))
+              .get_sys_time()
+              .time_since_epoch();
+      const Unit unit = Unit{options->multiple};
+      const Unit t3 =
+          (t2.count() >= 0) ? t2 / unit * unit : (t2 - unit + Unit{1}) / unit * unit;
+      const Duration t4 = duration_cast<Duration>(t3 + li.first.offset);
+      if (t < t4.count()) {
+        return duration_cast<Duration>(t3 + li.second.offset);
+      }
+      return duration_cast<Duration>(t4);
+    } catch (const arrow_vendored::date::nonexistent_local_time&) {
+      // In case we hit a nonexistent period we calculate the duration between the
+      // start of nonexistent period and rounded to moment in UTC (nonexistent_offset).
+      // We then floor the beginning of the nonexisting period in local time and add
+      // nonexistent_offset to that time point in UTC.
+      const arrow_vendored::date::local_info li =
+          tz->get_info(local_time<Duration>(duration_cast<Duration>(d2)));
+
+      const Duration t2 =
+          zoned_time<Duration>(tz, local_time<Duration>(duration_cast<Duration>(d2)),
+                               arrow_vendored::date::choose::earliest)
+              .get_sys_time()
+              .time_since_epoch();
+      const Duration nonexistent_offset = duration_cast<Duration>(t2 - d2);
+
+      const Unit unit = Unit{options->multiple};
+      const Unit t3 =
+          (t2.count() >= 0) ? t2 / unit * unit : (t2 - unit + Unit{1}) / unit * unit;
+      return duration_cast<Duration>(t3 + li.second.offset) + nonexistent_offset;
+    }
+  }
+
+  template <typename Duration, typename Unit>
+  Duration CeilTimePoint(const int64_t t, const RoundTemporalOptions* options) const {
+    const Duration d = FloorTimePoint<Duration, Unit>(t, options);
+    if (d.count() == t) {
+      return d;
+    }
+    return FloorTimePoint<Duration, Unit>(
+        t + duration_cast<Duration>(Unit{options->multiple}).count(), options);

Review Comment:
   There is definitely a more efficient way; however, this PR doesn't change the current situation, and the optimisation would fit more logically into the scope of https://github.com/apache/arrow/pull/13043.
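
For reference, the floor expression used in `FloorTimePoint` above compensates for C++ integer division truncating toward zero. The following is a minimal Python sketch of that arithmetic, not the PR's implementation. The helper names `trunc_div`, `floor_to_multiple` and `ceil_to_multiple` are hypothetical, and the single-floor ceiling is only an assumption about what a cheaper variant might look like when no UTC offset change is involved.

```python
def trunc_div(a, b):
    """Integer division truncating toward zero, like C++ `/` on integers."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def floor_to_multiple(d, unit):
    # Mirrors: (d >= 0) ? d / unit * unit : (d - unit + 1) / unit * unit
    if d >= 0:
        return trunc_div(d, unit) * unit
    return trunc_div(d - unit + 1, unit) * unit


def ceil_to_multiple(d, unit):
    # Assumption: valid only while the UTC offset stays constant.
    f = floor_to_multiple(d, unit)
    return f if f == d else f + unit


print(floor_to_multiple(-17, 16))  # -32
print(ceil_to_multiple(-17, 16))   # -16
```

Without the `d - unit + 1` adjustment, a negative count such as -17 would truncate to -16 instead of flooring to -32.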





[GitHub] [arrow] rok commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1131534899

   I really dislike that the sortedness of a list can be destroyed in calendar-based-origin mode, as shown by [Joris' example](https://github.com/apache/arrow/pull/12528#issuecomment-1131369183).




[GitHub] [arrow] pitrou commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
pitrou commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1149751125

   I think monotonicity in UTC is important for analytics use cases (such as bucketing). Monotonicity in wall time, not so much.
   
   > Use case would be something like when do users from a certain country come to my website. Rounding in UTC would be inconsistent. I believe it is a needed feature.
   
   I'm not sure I understand precisely what you mean, but is that part of an analytics workload?




[GitHub] [arrow] rok commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1124948943

   @pitrou those are errors that occur when we round from DST to non-DST or the reverse. I'm working on it.




[GitHub] [arrow] rok commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1124963081

   Yeah, there's an issue there. Let me push out a newer commit.




[GitHub] [arrow] rok commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1150984165

   @jorisvandenbossche what's your take on regular and "preserve_wall" behaviours here?




[GitHub] [arrow] rok commented on a diff in pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on code in PR #12528:
URL: https://github.com/apache/arrow/pull/12528#discussion_r871534682


##########
cpp/src/arrow/compute/kernels/scalar_temporal_test.cc:
##########
@@ -2611,6 +2611,260 @@ TEST_F(ScalarTemporalTest, TestRoundTemporal) {
   CheckScalarUnary(op, unit, times, unit, round_15_years, &round_to_15_years);
 }
 
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalAmbiguous1) {
+  // Asia/Tehran switches from UTC+4:30 to UTC+3:30 on 2022-09-22 00:00:00 UTC+4:30.
+  // This causes an hour long ambiguous period in local time.
+  auto unit = timestamp(TimeUnit::MILLI, "Asia/Tehran");
+  auto options = RoundTemporalOptions(25, CalendarUnit::MINUTE);
+  const char* times = R"([
+    "2022-09-21 18:09:00", "2022-09-21 18:10:00", "2022-09-21 18:11:00",
+    "2022-09-21 18:19:00", "2022-09-21 18:20:00", "2022-09-21 18:21:00",
+    "2022-09-21 18:44:00", "2022-09-21 18:45:00", "2022-09-21 18:46:00",
+    "2022-09-21 19:09:00", "2022-09-21 19:10:00", "2022-09-21 19:11:00",
+    "2022-09-21 19:24:00", "2022-09-21 19:25:00", "2022-09-21 19:26:00",
+    "2022-09-21 19:34:00", "2022-09-21 19:35:00", "2022-09-21 19:36:00",
+    "2022-09-21 19:59:00", "2022-09-21 20:00:00", "2022-09-21 20:01:00",
+    "2022-09-21 20:24:00", "2022-09-21 20:25:00", "2022-09-21 20:26:00",
+    "2022-09-21 20:49:00", "2022-09-21 20:50:00", "2022-09-21 20:51:00"])";
+  const char* times_ceil = R"([
+    "2022-09-21 18:10:00", "2022-09-21 18:10:00", "2022-09-21 18:20:00",

Review Comment:
   We're rounding relative to an origin of `1970-01-01 00:00:00` in local time.
   ```
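   import pandas as pd

   m = 1000000000  # nanoseconds per second
   # Seconds since the Unix epoch; tz_convert changes the display zone, not the instant.
   epoch_time = pd.to_datetime(["2022-09-21 18:22:10"], utc=True).tz_convert("CET").astype(int) / m
   offset = 3600 * 3  # example UTC offset of 3 hours, in seconds
   # Floor to 16 minutes relative to the UTC epoch.
   t = epoch_time / 60 // 16 * 16 * 60 * m
   # Floor to 16 minutes relative to a local-time epoch: shift, floor, shift back.
   t2 = ((epoch_time + offset) / 60 // 16 * 16 * 60 - offset) * m
   print(pd.to_datetime((t2).astype(int)))
   print(pd.to_datetime((t).astype(int)))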
   ```
   Will return:
   ```
   DatetimeIndex(['2022-09-21 18:20:00'], dtype='datetime64[ns]', freq=None)
   DatetimeIndex(['2022-09-21 18:08:00'], dtype='datetime64[ns]', freq=None)
   ```





[GitHub] [arrow] pitrou commented on a diff in pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
pitrou commented on code in PR #12528:
URL: https://github.com/apache/arrow/pull/12528#discussion_r873708485


##########
cpp/src/arrow/compute/kernels/scalar_temporal_test.cc:
##########
@@ -2611,6 +2611,309 @@ TEST_F(ScalarTemporalTest, TestRoundTemporal) {
   CheckScalarUnary(op, unit, times, unit, round_15_years, &round_to_15_years);
 }
 
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalAmbiguous1) {
+  // Asia/Tehran switches from UTC+4:30 to UTC+3:30 on 2022-09-22 00:00:00 UTC+4:30.
+  // This causes an hour long ambiguous period in local time.
+  auto unit = timestamp(TimeUnit::MILLI, "Asia/Tehran");
+  auto options = RoundTemporalOptions(25, CalendarUnit::MINUTE);
+  const char* times = R"([
+    "2022-09-21 18:09:00", "2022-09-21 18:10:00", "2022-09-21 18:11:00",
+    "2022-09-21 18:19:00", "2022-09-21 18:20:00", "2022-09-21 18:21:00",
+    "2022-09-21 18:44:00", "2022-09-21 18:45:00", "2022-09-21 18:46:00",
+    "2022-09-21 19:09:00", "2022-09-21 19:10:00", "2022-09-21 19:11:00",
+    "2022-09-21 19:24:00", "2022-09-21 19:25:00", "2022-09-21 19:26:00",
+    "2022-09-21 19:34:00", "2022-09-21 19:35:00", "2022-09-21 19:36:00",
+    "2022-09-21 19:59:00", "2022-09-21 20:00:00", "2022-09-21 20:01:00",
+    "2022-09-21 20:24:00", "2022-09-21 20:25:00", "2022-09-21 20:26:00",
+    "2022-09-21 20:49:00", "2022-09-21 20:50:00", "2022-09-21 20:51:00"])";
+  const char* times_ceil = R"([
+    "2022-09-21 18:10:00", "2022-09-21 18:10:00", "2022-09-21 18:20:00",
+    "2022-09-21 18:20:00", "2022-09-21 18:20:00", "2022-09-21 18:45:00",
+    "2022-09-21 18:45:00", "2022-09-21 18:45:00", "2022-09-21 19:10:00",
+    "2022-09-21 19:10:00", "2022-09-21 19:10:00", "2022-09-21 19:35:00",
+    "2022-09-21 19:35:00", "2022-09-21 19:35:00", "2022-09-21 19:35:00",
+    "2022-09-21 19:35:00", "2022-09-21 19:35:00", "2022-09-21 20:00:00",
+    "2022-09-21 20:00:00", "2022-09-21 20:00:00", "2022-09-21 20:25:00",
+    "2022-09-21 20:25:00", "2022-09-21 20:25:00", "2022-09-21 20:50:00",
+    "2022-09-21 20:50:00", "2022-09-21 20:50:00", "2022-09-21 21:15:00"])";
+  const char* times_floor = R"([
+    "2022-09-21 17:45:00", "2022-09-21 18:10:00", "2022-09-21 18:10:00",
+    "2022-09-21 18:10:00", "2022-09-21 18:10:00", "2022-09-21 18:10:00",
+    "2022-09-21 18:20:00", "2022-09-21 18:45:00", "2022-09-21 18:45:00",
+    "2022-09-21 18:45:00", "2022-09-21 19:10:00", "2022-09-21 19:10:00",
+    "2022-09-21 19:10:00", "2022-09-21 19:10:00", "2022-09-21 19:10:00",
+    "2022-09-21 19:10:00", "2022-09-21 19:35:00", "2022-09-21 19:35:00",
+    "2022-09-21 19:35:00", "2022-09-21 20:00:00", "2022-09-21 20:00:00",
+    "2022-09-21 20:00:00", "2022-09-21 20:25:00", "2022-09-21 20:25:00",
+    "2022-09-21 20:25:00", "2022-09-21 20:50:00", "2022-09-21 20:50:00"])";
+  const char* times_round = R"([
+    "2022-09-21 18:10:00", "2022-09-21 18:10:00", "2022-09-21 18:10:00",
+    "2022-09-21 18:20:00", "2022-09-21 18:20:00", "2022-09-21 18:10:00",
+    "2022-09-21 18:45:00", "2022-09-21 18:45:00", "2022-09-21 18:45:00",
+    "2022-09-21 19:10:00", "2022-09-21 19:10:00", "2022-09-21 19:10:00",
+    "2022-09-21 19:35:00", "2022-09-21 19:35:00", "2022-09-21 19:35:00",
+    "2022-09-21 19:35:00", "2022-09-21 19:35:00", "2022-09-21 19:35:00",
+    "2022-09-21 20:00:00", "2022-09-21 20:00:00", "2022-09-21 20:00:00",
+    "2022-09-21 20:25:00", "2022-09-21 20:25:00", "2022-09-21 20:25:00",
+    "2022-09-21 20:50:00", "2022-09-21 20:50:00", "2022-09-21 20:50:00"])";
+
+  CheckScalarUnary("ceil_temporal", unit, times, unit, times_ceil, &options);
+  CheckScalarUnary("floor_temporal", unit, times, unit, times_floor, &options);
+  CheckScalarUnary("round_temporal", unit, times, unit, times_round, &options);
+}
+
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalAmbiguous2) {
+  // Europe/Brussels switches from UTC+2:00 to UTC+1:00 on 2018-10-28 03:00:00 UTC+2:00
+  // This causes an hour long ambiguous period in local time.
+  auto unit = timestamp(TimeUnit::NANO, "Europe/Brussels");
+  auto options = RoundTemporalOptions(25, CalendarUnit::MINUTE);
+  const char* complete_times = R"([
+    "2018-10-27 23:05:00", "2018-10-27 23:06:00", "2018-10-27 23:07:00", "2018-10-27 23:08:00",
+    "2018-10-27 23:09:00", "2018-10-27 23:10:00", "2018-10-27 23:11:00", "2018-10-27 23:12:00",
+    "2018-10-27 23:13:00", "2018-10-27 23:14:00", "2018-10-27 23:15:00", "2018-10-27 23:16:00",
+    "2018-10-27 23:17:00", "2018-10-27 23:18:00", "2018-10-27 23:19:00", "2018-10-27 23:20:00",
+    "2018-10-27 23:21:00", "2018-10-27 23:22:00", "2018-10-27 23:23:00", "2018-10-27 23:24:00",
+    "2018-10-27 23:25:00", "2018-10-27 23:26:00", "2018-10-27 23:27:00", "2018-10-27 23:28:00",
+    "2018-10-27 23:29:00", "2018-10-27 23:30:00", "2018-10-27 23:31:00", "2018-10-27 23:32:00",
+    "2018-10-27 23:33:00", "2018-10-27 23:34:00", "2018-10-27 23:35:00", "2018-10-27 23:36:00",
+    "2018-10-27 23:37:00", "2018-10-27 23:38:00", "2018-10-27 23:39:00", "2018-10-27 23:40:00",
+    "2018-10-27 23:41:00", "2018-10-27 23:42:00", "2018-10-27 23:43:00", "2018-10-27 23:44:00",
+    "2018-10-27 23:45:00", "2018-10-27 23:46:00", "2018-10-27 23:47:00", "2018-10-27 23:48:00",
+    "2018-10-27 23:49:00", "2018-10-27 23:50:00", "2018-10-27 23:51:00", "2018-10-27 23:52:00",
+    "2018-10-27 23:53:00", "2018-10-27 23:54:00", "2018-10-27 23:55:00", "2018-10-27 23:56:00",
+    "2018-10-27 23:57:00", "2018-10-27 23:58:00", "2018-10-27 23:59:00", "2018-10-28 00:00:00",
+    "2018-10-28 00:01:00", "2018-10-28 00:02:00", "2018-10-28 00:03:00", "2018-10-28 00:04:00",
+    "2018-10-28 00:05:00", "2018-10-28 00:06:00", "2018-10-28 00:07:00", "2018-10-28 00:08:00",
+    "2018-10-28 00:09:00", "2018-10-28 00:10:00", "2018-10-28 00:11:00", "2018-10-28 00:12:00",
+    "2018-10-28 00:13:00", "2018-10-28 00:14:00", "2018-10-28 00:15:00", "2018-10-28 00:16:00",
+    "2018-10-28 00:17:00", "2018-10-28 00:18:00", "2018-10-28 00:19:00", "2018-10-28 00:20:00",
+    "2018-10-28 00:21:00", "2018-10-28 00:22:00", "2018-10-28 00:23:00", "2018-10-28 00:24:00",
+    "2018-10-28 00:25:00", "2018-10-28 00:26:00", "2018-10-28 00:27:00", "2018-10-28 00:28:00",
+    "2018-10-28 00:29:00", "2018-10-28 00:30:00", "2018-10-28 00:31:00", "2018-10-28 00:32:00",
+    "2018-10-28 00:33:00", "2018-10-28 00:34:00", "2018-10-28 00:35:00", "2018-10-28 00:36:00",
+    "2018-10-28 00:37:00", "2018-10-28 00:38:00", "2018-10-28 00:39:00", "2018-10-28 00:40:00",
+    "2018-10-28 00:41:00", "2018-10-28 00:42:00", "2018-10-28 00:43:00", "2018-10-28 00:44:00",
+    "2018-10-28 00:45:00", "2018-10-28 00:46:00", "2018-10-28 00:47:00", "2018-10-28 00:48:00",
+    "2018-10-28 00:49:00", "2018-10-28 00:50:00", "2018-10-28 00:51:00", "2018-10-28 00:52:00",
+    "2018-10-28 00:53:00", "2018-10-28 00:54:00", "2018-10-28 00:55:00", "2018-10-28 00:56:00",
+    "2018-10-28 00:57:00", "2018-10-28 00:58:00", "2018-10-28 00:59:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:01:00", "2018-10-28 01:02:00", "2018-10-28 01:03:00", "2018-10-28 01:04:00",
+    "2018-10-28 01:05:00", "2018-10-28 01:06:00", "2018-10-28 01:07:00", "2018-10-28 01:08:00",
+    "2018-10-28 01:09:00", "2018-10-28 01:10:00", "2018-10-28 01:11:00", "2018-10-28 01:12:00",
+    "2018-10-28 01:13:00", "2018-10-28 01:14:00", "2018-10-28 01:15:00", "2018-10-28 01:16:00",
+    "2018-10-28 01:17:00", "2018-10-28 01:18:00", "2018-10-28 01:19:00", "2018-10-28 01:20:00",
+    "2018-10-28 01:21:00", "2018-10-28 01:22:00", "2018-10-28 01:23:00", "2018-10-28 01:24:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:26:00", "2018-10-28 01:27:00", "2018-10-28 01:28:00",
+    "2018-10-28 01:29:00", "2018-10-28 01:30:00", "2018-10-28 01:31:00", "2018-10-28 01:32:00",
+    "2018-10-28 01:33:00", "2018-10-28 01:34:00", "2018-10-28 01:35:00", "2018-10-28 01:36:00",
+    "2018-10-28 01:37:00", "2018-10-28 01:38:00", "2018-10-28 01:39:00", "2018-10-28 01:40:00",
+    "2018-10-28 01:41:00", "2018-10-28 01:42:00", "2018-10-28 01:43:00", "2018-10-28 01:44:00",
+    "2018-10-28 01:45:00", "2018-10-28 01:46:00", "2018-10-28 01:47:00", "2018-10-28 01:48:00",
+    "2018-10-28 01:49:00", "2018-10-28 01:50:00", "2018-10-28 01:51:00", "2018-10-28 01:52:00",
+    "2018-10-28 01:53:00", "2018-10-28 01:54:00", "2018-10-28 01:55:00", "2018-10-28 01:56:00",
+    "2018-10-28 01:57:00", "2018-10-28 01:58:00", "2018-10-28 01:59:00", "2018-10-28 02:00:00",
+    "2018-10-28 02:01:00", "2018-10-28 02:02:00", "2018-10-28 02:03:00", "2018-10-28 02:04:00",
+    "2018-10-28 02:05:00", "2018-10-28 02:06:00", "2018-10-28 02:07:00", "2018-10-28 02:08:00",
+    "2018-10-28 02:09:00", "2018-10-28 02:10:00", "2018-10-28 02:11:00", "2018-10-28 02:12:00",
+    "2018-10-28 02:13:00", "2018-10-28 02:14:00", "2018-10-28 02:15:00", "2018-10-28 02:16:00"])";
+
+  const char* complete_times_floor = R"([
+    "2018-10-27 22:45:00", "2018-10-27 22:45:00", "2018-10-27 22:45:00", "2018-10-27 22:45:00",
+    "2018-10-27 22:45:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:45:00",
+    "2018-10-27 23:45:00", "2018-10-27 23:45:00", "2018-10-27 23:45:00", "2018-10-27 23:45:00",
+    "2018-10-27 23:45:00", "2018-10-27 23:45:00", "2018-10-27 23:45:00", "2018-10-27 23:45:00",
+    "2018-10-27 23:45:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00",
+    "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00",
+    "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00",
+    "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00",
+    "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00",
+    "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00",
+    "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00",
+    "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00",
+    "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00",
+    "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00",
+    "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00",
+    "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00",
+    "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 02:15:00", "2018-10-28 02:15:00"])";
+
+  const char* times = R"([
+    "2018-10-27 22:44:00", "2018-10-27 22:45:00", "2018-10-27 22:46:00",
+    "2018-10-27 23:09:00", "2018-10-27 23:10:00", "2018-10-27 23:11:00",
+    "2018-10-27 23:34:00", "2018-10-27 23:35:00", "2018-10-27 23:36:00",
+    "2018-10-27 23:44:00", "2018-10-27 23:45:00", "2018-10-27 23:46:00",
+    "2018-10-27 23:46:00", "2018-10-28 00:00:00", "2018-10-28 00:09:00",
+    "2018-10-28 00:09:00", "2018-10-28 00:10:00", "2018-10-28 00:11:00",
+    "2018-10-28 00:34:00", "2018-10-28 00:35:00", "2018-10-28 00:36:00",
+    "2018-10-28 00:59:00", "2018-10-28 01:00:00", "2018-10-28 01:01:00",
+    "2018-10-28 01:24:00", "2018-10-28 01:25:00", "2018-10-28 01:26:00",
+    "2018-10-28 01:49:00", "2018-10-28 01:50:00", "2018-10-28 01:51:00",
+    "2018-10-28 02:14:00", "2018-10-28 02:15:00", "2018-10-28 02:16:00"])";
+  const char* times_ceil = R"([
+    "2018-10-27 22:45:00", "2018-10-27 22:45:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:45:00",
+    "2018-10-27 23:45:00", "2018-10-27 23:45:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:35:00",
+    "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:50:00",
+    "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 02:15:00",
+    "2018-10-28 02:15:00", "2018-10-28 02:15:00", "2018-10-28 02:40:00"])";
+  const char* times_floor = R"([
+    "2018-10-27 22:20:00", "2018-10-27 22:45:00", "2018-10-27 22:45:00",
+    "2018-10-27 22:45:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:45:00", "2018-10-27 23:45:00",
+    "2018-10-27 23:45:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00",
+    "2018-10-28 00:35:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00",
+    "2018-10-28 01:50:00", "2018-10-28 02:15:00", "2018-10-28 02:15:00"])";
+  const char* times_round = R"([
+    "2018-10-27 22:45:00", "2018-10-27 22:45:00", "2018-10-27 22:45:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:45:00", "2018-10-27 23:45:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00",
+    "2018-10-28 02:15:00", "2018-10-28 02:15:00", "2018-10-28 02:15:00"])";
+
+  CheckScalarUnary("floor_temporal", unit, complete_times, unit, complete_times_floor,
+                   &options);
+  CheckScalarUnary("ceil_temporal", unit, times, unit, times_ceil, &options);
+  CheckScalarUnary("floor_temporal", unit, times, unit, times_floor, &options);
+  CheckScalarUnary("round_temporal", unit, times, unit, times_round, &options);
+}
+
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalNonexistent1) {
+  // Asia/Tehran switches from UTC+3:30 to UTC+4:30 on 2022-03-22 00:00:00 UTC+3:30
+  // This causes an hour long non-existing period in local time.
+  auto unit = timestamp(TimeUnit::SECOND, "Asia/Tehran");
+  auto options = RoundTemporalOptions(16, CalendarUnit::MINUTE);
+  const char* times = R"([
+    "2022-03-21 19:30:00", "2022-03-21 20:00:00", "2022-03-21 20:30:00"])";
+  const char* times_ceil = R"([
+    "2022-03-21 19:42:00", "2022-03-21 20:14:00", "2022-03-21 20:34:00"])";
+  const char* times_floor = R"([
+    "2022-03-21 19:26:00", "2022-03-21 19:58:00", "2022-03-21 20:18:00"])";

Review Comment:
   This still doesn't work: there are 20 minutes of local time between `2022-03-21 19:58:00` and `2022-03-21 20:18:00`, which is not a multiple of 16, so the two timestamps cannot be _both_ multiples of 16.
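
The arithmetic behind this objection can be checked with a short sketch. It assumes, as the comment does, that the three expected floor values share the same UTC offset, so their differences equal the differences in local time. `gaps_in_minutes` is a hypothetical helper, not part of the test suite.

```python
from datetime import datetime, timedelta


def gaps_in_minutes(stamps):
    """Minutes between consecutive "YYYY-MM-DD HH:MM:SS" strings."""
    parsed = [datetime.strptime(s, "%Y-%m-%d %H:%M:%S") for s in stamps]
    return [(b - a) // timedelta(minutes=1) for a, b in zip(parsed, parsed[1:])]


# Expected floor values quoted from the test above.
times_floor = ["2022-03-21 19:26:00", "2022-03-21 19:58:00", "2022-03-21 20:18:00"]
gaps = gaps_in_minutes(times_floor)
print(gaps)                    # [32, 20]
print([g % 16 for g in gaps])  # [0, 4]
```

The first gap is a multiple of 16 minutes, while the second leaves a remainder of 4, which is the inconsistency being pointed out.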





[GitHub] [arrow] rok commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1125003206

   > I don't think that is correct, because 3h08 is not a multiple of 16 minutes:
   > 
   > ```
   > >>> divmod(3*60+8, 16)
   > (11, 12)
   > ```
   
   If we do the rounding in local time, the offset from UTC changes at DST jumps. I originally generated the test cases by rounding just prior to the DST jump and continuing in equal steps through the nonexistent time. After the nonexistent time I would continue with regular rounding in local time. I am getting a bit test-blind, so I could really use help validating the current test data or getting new test data :).




[GitHub] [arrow] pitrou commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
pitrou commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1125108653

   @rok The tests don't seem to be exercising the case of rounding by an interval of more than one hour (for example 128 minutes), in which case the input timestamp might be in DST but not the rounded timestamp, or vice-versa...




[GitHub] [arrow] pitrou commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
pitrou commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1118577794

   Yes, I agree with moving them out of scope.




[GitHub] [arrow] amol- closed pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by "amol- (via GitHub)" <gi...@apache.org>.
amol- closed pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time
URL: https://github.com/apache/arrow/pull/12528




[GitHub] [arrow] pitrou commented on a diff in pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
pitrou commented on code in PR #12528:
URL: https://github.com/apache/arrow/pull/12528#discussion_r865932532


##########
cpp/src/arrow/compute/kernels/scalar_temporal_test.cc:
##########
@@ -2611,6 +2611,114 @@ TEST_F(ScalarTemporalTest, TestRoundTemporal) {
   CheckScalarUnary(op, unit, times, unit, round_15_years, &round_to_15_years);
 }
 
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalAmbiguous1) {
+  // Asia/Tehran switches from UTC+4:30 to UTC+3:30 on 2022-09-22 00:00:00 UTC+4:30.
+  // This causes an hour long ambiguous period in local time.
+  auto unit = timestamp(TimeUnit::MILLI, "Asia/Tehran");
+  auto options = RoundTemporalOptions(1, CalendarUnit::HOUR, /*some_parameter=*/true);
+  const char* times = R"([
+    "2022-03-21 19:30:00", "2022-03-21 20:00:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 19:00:00", "2022-09-21 19:30:00",
+    "2022-09-21 20:00:00", "2022-09-21 20:30:00", "2022-09-21 21:00:00",
+    "2022-09-21 21:30:00"])";
+  const char* times_ceil = R"([
+    "2022-03-21 19:30:00", "2022-03-21 20:30:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 19:30:00", "2022-09-21 19:30:00",
+    "2022-09-21 20:30:00", "2022-09-21 20:30:00", "2022-09-21 21:30:00",
+    "2022-09-21 21:30:00"])";
+  const char* times_floor = R"([
+    "2022-03-21 19:30:00", "2022-03-21 19:30:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 18:30:00", "2022-09-21 19:30:00",
+    "2022-09-21 19:30:00", "2022-09-21 20:30:00", "2022-09-21 20:30:00",
+    "2022-09-21 21:30:00"])";
+  const char* times_round = R"([
+    "2022-03-21 19:30:00", "2022-03-21 20:30:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 19:30:00", "2022-09-21 19:30:00",

Review Comment:
   Can you choose the examples so that round gives different results from both ceil and floor? Here it seems it's just giving the same results as ceil...



##########
cpp/src/arrow/compute/kernels/scalar_temporal_test.cc:
##########
@@ -2611,6 +2611,114 @@ TEST_F(ScalarTemporalTest, TestRoundTemporal) {
   CheckScalarUnary(op, unit, times, unit, round_15_years, &round_to_15_years);
 }
 
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalAmbiguous1) {
+  // Asia/Tehran switches from UTC+4:30 to UTC+3:30 on 2022-09-22 00:00:00 UTC+4:30.
+  // This causes an hour long ambiguous period in local time.
+  auto unit = timestamp(TimeUnit::MILLI, "Asia/Tehran");
+  auto options = RoundTemporalOptions(1, CalendarUnit::HOUR, /*some_parameter=*/true);
+  const char* times = R"([
+    "2022-03-21 19:30:00", "2022-03-21 20:00:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 19:00:00", "2022-09-21 19:30:00",
+    "2022-09-21 20:00:00", "2022-09-21 20:30:00", "2022-09-21 21:00:00",
+    "2022-09-21 21:30:00"])";
+  const char* times_ceil = R"([
+    "2022-03-21 19:30:00", "2022-03-21 20:30:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 19:30:00", "2022-09-21 19:30:00",
+    "2022-09-21 20:30:00", "2022-09-21 20:30:00", "2022-09-21 21:30:00",
+    "2022-09-21 21:30:00"])";
+  const char* times_floor = R"([
+    "2022-03-21 19:30:00", "2022-03-21 19:30:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 18:30:00", "2022-09-21 19:30:00",
+    "2022-09-21 19:30:00", "2022-09-21 20:30:00", "2022-09-21 20:30:00",
+    "2022-09-21 21:30:00"])";
+  const char* times_round = R"([
+    "2022-03-21 19:30:00", "2022-03-21 20:30:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 19:30:00", "2022-09-21 19:30:00",
+    "2022-09-21 20:30:00", "2022-09-21 20:30:00", "2022-09-21 21:30:00",
+    "2022-09-21 21:30:00"])";
+
+  CheckScalarUnary("ceil_temporal", unit, times, unit, times_ceil, &options);
+  CheckScalarUnary("floor_temporal", unit, times, unit, times_floor, &options);
+  CheckScalarUnary("round_temporal", unit, times, unit, times_round, &options);
+}
+
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalAmbiguous2) {
+  // Europe/Brussels switches from UTC+2:00 to UTC+1:00 on 2018-10-28 03:00:00 UTC+2:00
+  // This causes an hour long ambiguous period in local time.
+  auto unit = timestamp(TimeUnit::NANO, "Europe/Brussels");
+  auto options = RoundTemporalOptions(15, CalendarUnit::MINUTE, true);

Review Comment:
   Also add parameter name here and below...



##########
cpp/src/arrow/compute/kernels/scalar_temporal_test.cc:
##########
@@ -2611,6 +2611,114 @@ TEST_F(ScalarTemporalTest, TestRoundTemporal) {
   CheckScalarUnary(op, unit, times, unit, round_15_years, &round_to_15_years);
 }
 
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalAmbiguous1) {
+  // Asia/Tehran switches from UTC+4:30 to UTC+3:30 on 2022-09-22 00:00:00 UTC+4:30.
+  // This causes an hour long ambiguous period in local time.

Review Comment:
   Thank you. Why are we also testing 2022-03-21 here?



##########
cpp/src/arrow/compute/kernels/scalar_temporal_test.cc:
##########
@@ -2611,6 +2611,114 @@ TEST_F(ScalarTemporalTest, TestRoundTemporal) {
   CheckScalarUnary(op, unit, times, unit, round_15_years, &round_to_15_years);
 }
 
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalAmbiguous1) {
+  // Asia/Tehran switches from UTC+4:30 to UTC+3:30 on 2022-09-22 00:00:00 UTC+4:30.
+  // This causes an hour long ambiguous period in local time.
+  auto unit = timestamp(TimeUnit::MILLI, "Asia/Tehran");
+  auto options = RoundTemporalOptions(1, CalendarUnit::HOUR, /*some_parameter=*/true);
+  const char* times = R"([
+    "2022-03-21 19:30:00", "2022-03-21 20:00:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 19:00:00", "2022-09-21 19:30:00",
+    "2022-09-21 20:00:00", "2022-09-21 20:30:00", "2022-09-21 21:00:00",
+    "2022-09-21 21:30:00"])";
+  const char* times_ceil = R"([
+    "2022-03-21 19:30:00", "2022-03-21 20:30:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 19:30:00", "2022-09-21 19:30:00",
+    "2022-09-21 20:30:00", "2022-09-21 20:30:00", "2022-09-21 21:30:00",
+    "2022-09-21 21:30:00"])";
+  const char* times_floor = R"([
+    "2022-03-21 19:30:00", "2022-03-21 19:30:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 18:30:00", "2022-09-21 19:30:00",
+    "2022-09-21 19:30:00", "2022-09-21 20:30:00", "2022-09-21 20:30:00",
+    "2022-09-21 21:30:00"])";
+  const char* times_round = R"([
+    "2022-03-21 19:30:00", "2022-03-21 20:30:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 19:30:00", "2022-09-21 19:30:00",
+    "2022-09-21 20:30:00", "2022-09-21 20:30:00", "2022-09-21 21:30:00",
+    "2022-09-21 21:30:00"])";
+
+  CheckScalarUnary("ceil_temporal", unit, times, unit, times_ceil, &options);
+  CheckScalarUnary("floor_temporal", unit, times, unit, times_floor, &options);
+  CheckScalarUnary("round_temporal", unit, times, unit, times_round, &options);
+}
+
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalAmbiguous2) {
+  // Europe/Brussels switches from UTC+2:00 to UTC+1:00 on 2018-10-28 03:00:00 UTC+2:00
+  // This causes an hour long ambiguous period in local time.
+  auto unit = timestamp(TimeUnit::NANO, "Europe/Brussels");
+  auto options = RoundTemporalOptions(15, CalendarUnit::MINUTE, true);
+  const char* times = R"([
+    "2018-10-28 01:05:00", "2018-10-28 01:20:00", "2018-10-28 01:55:00",
+    "2018-10-28 01:59:59", "2018-10-28 02:00:00", "2018-10-28 02:08:00"])";
+  const char* times_ceil = R"([
+    "2018-10-28 01:15:00", "2018-10-28 01:30:00", "2018-10-28 02:00:00",
+    "2018-10-28 02:00:00", "2018-10-28 02:00:00", "2018-10-28 02:15:00"])";
+  const char* times_floor = R"([
+    "2018-10-28 01:00:00", "2018-10-28 01:15:00", "2018-10-28 01:45:00",
+    "2018-10-28 01:45:00", "2018-10-28 02:00:00", "2018-10-28 02:00:00"])";
+  const char* times_round = R"([
+    "2018-10-28 01:00:00", "2018-10-28 01:15:00", "2018-10-28 02:00:00",
+    "2018-10-28 02:00:00", "2018-10-28 02:00:00", "2018-10-28 02:15:00"])";
+
+  CheckScalarUnary("ceil_temporal", unit, times, unit, times_ceil, &options);
+  CheckScalarUnary("floor_temporal", unit, times, unit, times_floor, &options);
+  CheckScalarUnary("round_temporal", unit, times, unit, times_round, &options);
+}
+
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalNonexistent1) {
+  // Asia/Tehran switches from UTC+3:30 to UTC+4:30 on 2022-03-22 00:00:00 UTC+3:30
+  // This causes an hour long non-existing period in local time.
+  auto unit = timestamp(TimeUnit::SECOND, "Asia/Tehran");
+  auto options = RoundTemporalOptions(1, CalendarUnit::HOUR, true);
+  const char* times = R"([
+    "2022-03-21 19:30:00", "2022-03-21 20:00:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 19:00:00", "2022-09-21 19:30:00",
+    "2022-09-21 20:00:00", "2022-09-21 20:30:00", "2022-09-21 21:00:00",
+    "2022-09-21 21:30:00"])";
+  const char* times_ceil = R"([
+    "2022-03-21 19:30:00", "2022-03-21 20:30:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 19:30:00", "2022-09-21 19:30:00",
+    "2022-09-21 20:30:00", "2022-09-21 20:30:00", "2022-09-21 21:30:00",
+    "2022-09-21 21:30:00"])";
+  const char* times_floor = R"([
+    "2022-03-21 19:30:00", "2022-03-21 19:30:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 18:30:00", "2022-09-21 19:30:00",
+    "2022-09-21 19:30:00", "2022-09-21 20:30:00", "2022-09-21 20:30:00",
+    "2022-09-21 21:30:00"])";
+  const char* times_round = R"([
+    "2022-03-21 19:30:00", "2022-03-21 20:30:00", "2022-03-21 20:30:00",

Review Comment:
   Same here: choose the example so that round is different from both floor and ceil?



##########
cpp/src/arrow/compute/kernels/scalar_temporal_test.cc:
##########
@@ -2611,6 +2611,114 @@ TEST_F(ScalarTemporalTest, TestRoundTemporal) {
   CheckScalarUnary(op, unit, times, unit, round_15_years, &round_to_15_years);
 }
 
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalAmbiguous1) {
+  // Asia/Tehran switches from UTC+4:30 to UTC+3:30 on 2022-09-22 00:00:00 UTC+4:30.
+  // This causes an hour long ambiguous period in local time.
+  auto unit = timestamp(TimeUnit::MILLI, "Asia/Tehran");
+  auto options = RoundTemporalOptions(1, CalendarUnit::HOUR, /*some_parameter=*/true);
+  const char* times = R"([
+    "2022-03-21 19:30:00", "2022-03-21 20:00:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 19:00:00", "2022-09-21 19:30:00",
+    "2022-09-21 20:00:00", "2022-09-21 20:30:00", "2022-09-21 21:00:00",
+    "2022-09-21 21:30:00"])";
+  const char* times_ceil = R"([
+    "2022-03-21 19:30:00", "2022-03-21 20:30:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 19:30:00", "2022-09-21 19:30:00",
+    "2022-09-21 20:30:00", "2022-09-21 20:30:00", "2022-09-21 21:30:00",
+    "2022-09-21 21:30:00"])";
+  const char* times_floor = R"([
+    "2022-03-21 19:30:00", "2022-03-21 19:30:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 18:30:00", "2022-09-21 19:30:00",
+    "2022-09-21 19:30:00", "2022-09-21 20:30:00", "2022-09-21 20:30:00",
+    "2022-09-21 21:30:00"])";
+  const char* times_round = R"([
+    "2022-03-21 19:30:00", "2022-03-21 20:30:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 19:30:00", "2022-09-21 19:30:00",
+    "2022-09-21 20:30:00", "2022-09-21 20:30:00", "2022-09-21 21:30:00",
+    "2022-09-21 21:30:00"])";
+
+  CheckScalarUnary("ceil_temporal", unit, times, unit, times_ceil, &options);
+  CheckScalarUnary("floor_temporal", unit, times, unit, times_floor, &options);
+  CheckScalarUnary("round_temporal", unit, times, unit, times_round, &options);
+}
+
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalAmbiguous2) {
+  // Europe/Brussels switches from UTC+2:00 to UTC+1:00 on 2018-10-28 03:00:00 UTC+2:00
+  // This causes an hour long ambiguous period in local time.
+  auto unit = timestamp(TimeUnit::NANO, "Europe/Brussels");
+  auto options = RoundTemporalOptions(15, CalendarUnit::MINUTE, true);
+  const char* times = R"([
+    "2018-10-28 01:05:00", "2018-10-28 01:20:00", "2018-10-28 01:55:00",
+    "2018-10-28 01:59:59", "2018-10-28 02:00:00", "2018-10-28 02:08:00"])";
+  const char* times_ceil = R"([
+    "2018-10-28 01:15:00", "2018-10-28 01:30:00", "2018-10-28 02:00:00",
+    "2018-10-28 02:00:00", "2018-10-28 02:00:00", "2018-10-28 02:15:00"])";
+  const char* times_floor = R"([
+    "2018-10-28 01:00:00", "2018-10-28 01:15:00", "2018-10-28 01:45:00",
+    "2018-10-28 01:45:00", "2018-10-28 02:00:00", "2018-10-28 02:00:00"])";
+  const char* times_round = R"([
+    "2018-10-28 01:00:00", "2018-10-28 01:15:00", "2018-10-28 02:00:00",
+    "2018-10-28 02:00:00", "2018-10-28 02:00:00", "2018-10-28 02:15:00"])";
+
+  CheckScalarUnary("ceil_temporal", unit, times, unit, times_ceil, &options);
+  CheckScalarUnary("floor_temporal", unit, times, unit, times_floor, &options);
+  CheckScalarUnary("round_temporal", unit, times, unit, times_round, &options);
+}
+
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalNonexistent1) {
+  // Asia/Tehran switches from UTC+3:30 to UTC+4:30 on 2022-03-22 00:00:00 UTC+3:30
+  // This causes an hour long non-existing period in local time.
+  auto unit = timestamp(TimeUnit::SECOND, "Asia/Tehran");
+  auto options = RoundTemporalOptions(1, CalendarUnit::HOUR, true);
+  const char* times = R"([
+    "2022-03-21 19:30:00", "2022-03-21 20:00:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 19:00:00", "2022-09-21 19:30:00",

Review Comment:
   Why are we also testing 2022-09-21 here?



##########
cpp/src/arrow/compute/kernels/scalar_temporal_test.cc:
##########
@@ -2611,6 +2611,114 @@ TEST_F(ScalarTemporalTest, TestRoundTemporal) {
   CheckScalarUnary(op, unit, times, unit, round_15_years, &round_to_15_years);
 }
 
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalAmbiguous1) {
+  // Asia/Tehran switches from UTC+4:30 to UTC+3:30 on 2022-09-22 00:00:00 UTC+4:30.
+  // This causes an hour long ambiguous period in local time.
+  auto unit = timestamp(TimeUnit::MILLI, "Asia/Tehran");
+  auto options = RoundTemporalOptions(1, CalendarUnit::HOUR, /*some_parameter=*/true);

Review Comment:
   Can you put the proper parameter name? :-)





[GitHub] [arrow] rok commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1124975055

   Agh, this still didn't fix the Python issue. Will fix.




[GitHub] [arrow] rok commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1124973647

   @jorisvandenbossche @pitrou pushed.




[GitHub] [arrow] rok commented on a diff in pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on code in PR #12528:
URL: https://github.com/apache/arrow/pull/12528#discussion_r871540501


##########
cpp/src/arrow/compute/kernels/temporal_internal.h:
##########
@@ -119,19 +178,109 @@ struct ZonedLocalizer {
     return tz->to_local(sys_time<Duration>(Duration{t}));
   }
 
+  template <typename Duration, typename Unit>
+  Duration FloorTimePoint(const int64_t t, const RoundTemporalOptions& options) const {
+    const local_time<Duration> lt = tz->to_local(sys_time<Duration>(Duration{t}));

Review Comment:
   If you mean `arrow_vendored::date::choose::earliest` then the result is the beginning of an ambiguous period which is not really a multiple of a time unit.





[GitHub] [arrow] pitrou commented on a diff in pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
pitrou commented on code in PR #12528:
URL: https://github.com/apache/arrow/pull/12528#discussion_r873699687


##########
cpp/src/arrow/compute/kernels/scalar_temporal_test.cc:
##########
@@ -2611,6 +2611,260 @@ TEST_F(ScalarTemporalTest, TestRoundTemporal) {
   CheckScalarUnary(op, unit, times, unit, round_15_years, &round_to_15_years);
 }
 
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalAmbiguous1) {
+  // Asia/Tehran switches from UTC+4:30 to UTC+3:30 on 2022-09-22 00:00:00 UTC+4:30.
+  // This causes an hour long ambiguous period in local time.
+  auto unit = timestamp(TimeUnit::MILLI, "Asia/Tehran");
+  auto options = RoundTemporalOptions(25, CalendarUnit::MINUTE);
+  const char* times = R"([
+    "2022-09-21 18:09:00", "2022-09-21 18:10:00", "2022-09-21 18:11:00",
+    "2022-09-21 18:19:00", "2022-09-21 18:20:00", "2022-09-21 18:21:00",
+    "2022-09-21 18:44:00", "2022-09-21 18:45:00", "2022-09-21 18:46:00",
+    "2022-09-21 19:09:00", "2022-09-21 19:10:00", "2022-09-21 19:11:00",
+    "2022-09-21 19:24:00", "2022-09-21 19:25:00", "2022-09-21 19:26:00",
+    "2022-09-21 19:34:00", "2022-09-21 19:35:00", "2022-09-21 19:36:00",
+    "2022-09-21 19:59:00", "2022-09-21 20:00:00", "2022-09-21 20:01:00",
+    "2022-09-21 20:24:00", "2022-09-21 20:25:00", "2022-09-21 20:26:00",
+    "2022-09-21 20:49:00", "2022-09-21 20:50:00", "2022-09-21 20:51:00"])";
+  const char* times_ceil = R"([
+    "2022-09-21 18:10:00", "2022-09-21 18:10:00", "2022-09-21 18:20:00",

Review Comment:
   I'm not sure this matches the example? `2022-09-21 18:22:10` is not present here, also the offset is 3h30 not 3h.





[GitHub] [arrow] rok commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1126610492

   > BTW, I actually expected that the non-existing time case would still raise an error by default (based on [#12528 (comment)](https://github.com/apache/arrow/pull/12528#issuecomment-1062826028)). Given that this is much more of a corner case than the ambiguous time, I think it is fine that this isn't handled automatically.
   
   Would still be nice to have I suppose.
   
   > If we want to have some automatic inference of what the resulting time should be for the non-existent case: I suppose we still need to round in local time, but if you end up with a non-existent time like "02:56:00" like in the example above, in this case we should actually choose "01:56:00" as the correct local time. The logic to obtain this could be to check the earliest and latest valid time ("02:00" and "03:00"), check the difference between "02:56" and the latest, and then subtract that from the earliest time.
   
   Thanks for the suggestion that simplifies what I was doing earlier!




[GitHub] [arrow] rok commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1126610722

   @pitrou @jorisvandenbossche I think I addressed everything now. I think it's *time* for another pass.




[GitHub] [arrow] jorisvandenbossche commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1131361871

   Something else: coming back to the previous example (https://github.com/apache/arrow/pull/12528#issuecomment-1124954128) and the logic for resolving nonexistent times (https://github.com/apache/arrow/pull/12528#issuecomment-1125068996), I think there is still something wrong in that logic.
   
   Let's consider the DST jump in 2015 for CE(S)T, and a time 10min before, just before (1s) and at the jump, and 10min after the jump (the jump is from 02:00 to 03:00):
   
   ```
   arr = pa.array(["2015-03-29 01:50:00", "2015-03-29 01:59:59", "2015-03-29 03:00:00", "2015-03-29 03:10:00"]).cast(pa.timestamp("us"))
   arr = pc.assume_timezone(arr, "Europe/Brussels")
   
   In [35]: arr
   Out[35]: 
   <pyarrow.lib.TimestampArray object at 0x7fddd9b1ab80>
   [
     2015-03-29 00:50:00.000000,
     2015-03-29 00:59:59.000000,
     2015-03-29 01:00:00.000000,
     2015-03-29 01:10:00.000000
   ]
   
   In [36]: arr.to_pandas()
   Out[36]: 
   0   2015-03-29 01:50:00+01:00
   1   2015-03-29 01:59:59+01:00
   2   2015-03-29 03:00:00+02:00
   3   2015-03-29 03:10:00+02:00
   dtype: datetime64[ns, Europe/Brussels]
   ```
   
   Now, if we round this to 16min:
   
   ```
   In [37]: pc.round_temporal(arr, 16, "minute")
   Out[37]: 
   <pyarrow.lib.TimestampArray object at 0x7fddd9b24fa0>
   [
     2015-03-29 00:52:00.000000,
     2015-03-29 00:08:00.000000,
     2015-03-29 00:56:00.000000,
     2015-03-29 01:12:00.000000
   ]
   
   In [38]: pc.round_temporal(arr, 16, "minute").to_pandas()
   Out[38]: 
   0   2015-03-29 01:52:00+01:00
   1   2015-03-29 01:08:00+01:00
   2   2015-03-29 01:56:00+01:00
   3   2015-03-29 03:12:00+02:00
   dtype: datetime64[ns, Europe/Brussels]
   ```
   
   So the second value in the result is clearly wrong since the absolute difference between original and rounded should never be more than the rounding unit (and in case of rounding and not ceil/floor, actually never more than half of the rounding unit, so 8min in this case):
   
   ```
   In [43]: pc.subtract(arr, pc.round_temporal(arr, 16, "minute")).to_pandas().abs()
   Out[43]: 
   0   0 days 00:02:00
   1   0 days 00:51:59
   2   0 days 00:04:00
   3   0 days 00:02:00
   dtype: timedelta64[ns]
   ```
   
   This time of "01:59:59+01:00" was rounded in local time to the non-existent 02:08, which was then corrected to 01:08. So that's a flaw in the logic that is implemented (that I proposed?).
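
The "never more than half of the rounding unit" property can be turned into a mechanical check. The following is a rough sketch in plain Python, not Arrow code; the function name `violations` is made up for illustration, and the timestamps are the UTC values copied from the two listings above:

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"
unit = timedelta(minutes=16)

# UTC values copied from the input array and from the round_temporal result.
original = ["2015-03-29 00:50:00", "2015-03-29 00:59:59",
            "2015-03-29 01:00:00", "2015-03-29 01:10:00"]
rounded = ["2015-03-29 00:52:00", "2015-03-29 00:08:00",
           "2015-03-29 00:56:00", "2015-03-29 01:12:00"]

def violations(original, rounded, unit):
    """Return the indices where |original - rounded| exceeds half the unit."""
    bad = []
    for i, (o, r) in enumerate(zip(original, rounded)):
        delta = abs(datetime.strptime(o, FMT) - datetime.strptime(r, FMT))
        if delta > unit / 2:
            bad.append(i)
    return bad

print(violations(original, rounded, unit))
```

Only index 1 (the "01:59:59+01:00" input, off by 51 min 59 s) is flagged, matching the subtraction output above.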
   
   But also in general, the question is: _what is actually the expected / desired result in the first place?_ 
   Because the first and third value end up as "01:52:00+01:00" and "01:56:00+01:00" after rounding. But those two values are only 4 min apart, while the rounding was by 16min. So you would expect that every possible value in the result is 16min apart from each other.
   
   I think this comes back to: do we want the result to be rounded in "physical time" (so the results are actually 16 min apart), or in "wall clock time"? 
   Generally we round in local wall clock time, but when there are DST jumps, that will produce times that are not actually the rounding unit apart from each other (only when the rounding unit is not an exact divisor of the DST jump, like the 16min here).
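
To make the wall-clock versus physical-time distinction concrete, here is a small sketch in plain Python. It does not use a time zone database: the 2015-03-29 Europe/Brussels jump (02:00 to 03:00, offset +01:00 to +02:00) is hard-coded, and the names `UNIT`, `GAP` and `to_utc_minutes` are illustrative assumptions only:

```python
UNIT = 16              # rounding unit, in minutes
GAP = range(120, 180)  # local minutes 02:00..02:59 do not exist on this day

def to_utc_minutes(local_minute):
    """Convert minutes since local midnight to minutes on a UTC clock,
    using the hand-coded offsets (+01:00 before the jump, +02:00 after)."""
    offset = 60 if local_minute < GAP.start else 120
    return local_minute - offset

# Wall-clock multiples of 16 minutes that actually exist on this day.
grid = [m for m in range(0, 24 * 60, UNIT) if m not in GAP]
utc = [to_utc_minutes(m) for m in grid]

# Physical (UTC) distance between neighbouring wall-clock grid points.
physical_gaps = [b - a for a, b in zip(utc, utc[1:])]
print(sorted(set(physical_gaps)))
```

Every pair of neighbouring grid points is 16 physical minutes apart except the pair straddling the jump (01:52 and 03:12 local), which is 20 minutes apart. With a unit that divides 60 evenly, such as 15 minutes, the irregular gap would not appear.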
   
   (it might be worth checking how other software does this, if there is software that actually handles those cases .. Eg what does lubridate do?)
   
   




[GitHub] [arrow] jorisvandenbossche commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1131528343

   Ah, indeed. (I was assuming that with 16min it doesn't matter, because a day divides evenly into 16-minute parts, so whether or not you start counting from the start of the day makes no difference. But the "higher" calendar unit for minute is of course hour and not day ..)
   
   Using the branch from https://github.com/apache/arrow/pull/12657 (and using naive times) shows this:
   
   ```
   In [5]: arr = pa.array(["2015-03-29 01:50:00", "2015-03-29 01:59:59", "2015-03-29 03:00:00", "2015-03-29 03:10:00"]).cast(pa.timestamp("us"))
   
   In [6]: arr
   Out[6]: 
   <pyarrow.lib.TimestampArray object at 0x7f40eb249ee0>
   [
     2015-03-29 01:50:00.000000,
     2015-03-29 01:59:59.000000,
     2015-03-29 03:00:00.000000,
     2015-03-29 03:10:00.000000
   ]
   
   In [7]: pc.round_temporal(arr, 16, "minute")
   Out[7]: 
   <pyarrow.lib.TimestampArray object at 0x7f40ea1ce7c0>
   [
     2015-03-29 01:52:00.000000,
     2015-03-29 01:52:00.000000,
     2015-03-29 02:56:00.000000,
     2015-03-29 03:12:00.000000
   ]
   
   In [8]: pc.round_temporal(arr, 16, "minute", calendar_based_origin=True)
   Out[8]: 
   <pyarrow.lib.TimestampArray object at 0x7f40ea1ce700>
   [
     2015-03-29 01:48:00.000000,
     2015-03-29 02:04:00.000000,
     2015-03-29 03:00:00.000000,
     2015-03-29 03:16:00.000000
   ]
   ```
   
   Although I would say that the second result value "02:04:00" seems off? (if it should start to count from the hour, it should either be "02:00" or "02:16"; but in this case it is still counting from the hour before ("01:00" + 4*16 min)) But that is something for the other PR (and actually consistent with lubridate ..)
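
The two listings differ only in where the 16-minute grid starts. The sketch below, in plain Python on naive datetimes, reproduces both under two assumptions: that the default grid is counted from 1970-01-01 00:00:00, and that the calendar-based grid is counted from the start of the input's own hour. The helper names and the tie-breaking rule (ties round up) are illustrative and are not taken from the Arrow implementation:

```python
from datetime import datetime, timedelta

UNIT = timedelta(minutes=16)

def round_from(origin, t, unit=UNIT):
    """Round t to the nearest multiple of unit counted from origin (ties round up)."""
    steps = (t - origin + unit / 2) // unit
    return origin + steps * unit

def round_epoch(t):
    # Assumed default behaviour: grid anchored at the Unix epoch.
    return round_from(datetime(1970, 1, 1), t)

def round_hour(t):
    # Assumed calendar-based behaviour: grid anchored at the start of t's hour.
    return round_from(t.replace(minute=0, second=0, microsecond=0), t)

times = [datetime(2015, 3, 29, 1, 50), datetime(2015, 3, 29, 1, 59, 59),
         datetime(2015, 3, 29, 3, 0), datetime(2015, 3, 29, 3, 10)]
for t in times:
    print(t.time(), round_epoch(t).time(), round_hour(t).time())
```

Under these assumptions "01:59:59" is 59 min 59 s past its hour origin, which is closer to 4 × 16 = 64 minutes than to 48, hence "02:04:00".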




[GitHub] [arrow] rok commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1145552268

   I've rebased and refactored a bit because of #12657. Logic around DST jumps is almost ready.
   
   I've also introduced a flag (sigh) `RoundTemporalOptions.preserve_wall_time_order` that flattens the first ambiguous fold for floor (and should flatten the second for ceil); it would look like so:
   ![image](https://user-images.githubusercontent.com/54589/171780443-578900d8-9178-4312-8e24-0cd3238fa19c.png)
   The idea is we preserve UTC order by default, but with this flag we offer a way to preserve wall order as well.
   @jorisvandenbossche @pitrou any thoughts on this? Perhaps some other strategy we can take?




[GitHub] [arrow] jorisvandenbossche commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1125068996

   BTW, I actually expected that the non-existing time case would still raise an error by default (based on https://github.com/apache/arrow/pull/12528#issuecomment-1062826028). Given that this is a much more corner case than the ambiguous time, I think it is fine that this isn't handled automatically.
   
   If we want to have some automatic inference of what the resulting time should be for the non-existent case: I suppose we still need to round in local time, but if we end up with a non-existent time such as "02:56:00", as in the example above, we should actually choose "01:56:00" as the correct local time.
   The logic to obtain this could be to check the earliest and latest valid time ("02:00" and "03:00"), check the difference between "02:56" and the latest, and then subtract that from the earliest time.
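
The arithmetic described above can be sketched in a few lines of Python. This is only an illustration of the proposed rule, not code from the PR: the function name `shift_out_of_gap` and the explicit `gap_start`/`gap_end` parameters are assumptions made for the example, and the gap bounds are passed in by hand rather than looked up from a time zone database.

```python
from datetime import datetime


def shift_out_of_gap(naive_local, gap_start, gap_end):
    """Map a nonexistent wall time to gap_start minus its distance to gap_end.

    Hypothetical helper: the gap is the half-open interval [gap_start, gap_end)
    of wall times skipped by a spring-forward transition.
    """
    if not (gap_start <= naive_local < gap_end):
        return naive_local  # the wall time exists, nothing to adjust
    distance_to_end = gap_end - naive_local
    return gap_start - distance_to_end


# Bounds of the skipped hour from the comment above ("02:00" and "03:00").
gap_start = datetime(2015, 3, 29, 2, 0)
gap_end = datetime(2015, 3, 29, 3, 0)

result = shift_out_of_gap(datetime(2015, 3, 29, 2, 56), gap_start, gap_end)
print(result)  # 2015-03-29 01:56:00
```

Here "02:56" is 4 minutes before the end of the gap, so the result lands 4 minutes before its start.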
    




[GitHub] [arrow] rok commented on a diff in pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on code in PR #12528:
URL: https://github.com/apache/arrow/pull/12528#discussion_r869811157


##########
cpp/src/arrow/compute/kernels/scalar_temporal_test.cc:
##########
@@ -2611,6 +2611,114 @@ TEST_F(ScalarTemporalTest, TestRoundTemporal) {
   CheckScalarUnary(op, unit, times, unit, round_15_years, &round_to_15_years);
 }
 
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalAmbiguous1) {
+  // Asia/Tehran switches from UTC+4:30 to UTC+3:30 on 2022-09-22 00:00:00 UTC+4:30.
+  // This causes an hour long ambiguous period in local time.
+  auto unit = timestamp(TimeUnit::MILLI, "Asia/Tehran");
+  auto options = RoundTemporalOptions(1, CalendarUnit::HOUR, /*some_parameter=*/true);
+  const char* times = R"([
+    "2022-03-21 19:30:00", "2022-03-21 20:00:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 19:00:00", "2022-09-21 19:30:00",
+    "2022-09-21 20:00:00", "2022-09-21 20:30:00", "2022-09-21 21:00:00",
+    "2022-09-21 21:30:00"])";
+  const char* times_ceil = R"([
+    "2022-03-21 19:30:00", "2022-03-21 20:30:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 19:30:00", "2022-09-21 19:30:00",
+    "2022-09-21 20:30:00", "2022-09-21 20:30:00", "2022-09-21 21:30:00",
+    "2022-09-21 21:30:00"])";
+  const char* times_floor = R"([
+    "2022-03-21 19:30:00", "2022-03-21 19:30:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 18:30:00", "2022-09-21 19:30:00",
+    "2022-09-21 19:30:00", "2022-09-21 20:30:00", "2022-09-21 20:30:00",
+    "2022-09-21 21:30:00"])";
+  const char* times_round = R"([
+    "2022-03-21 19:30:00", "2022-03-21 20:30:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 19:30:00", "2022-09-21 19:30:00",

Review Comment:
   @pitrou I added a few tests; please check whether the results for the ambiguous and nonexistent cases make sense.





[GitHub] [arrow] rok commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1123105912

   @pitrou could you please do another pass?
   FloorTimePoint, RoundTimePoint and CeilTimePoint could now be templated with Floor/Round/CeilHelper functions.




[GitHub] [arrow] jorisvandenbossche commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1124954128

   I was also just trying my example from above (https://github.com/apache/arrow/pull/12528#issuecomment-1055262792) that results in a non-existing time. But now that actually returns something instead of erroring:
   
   ```
   In [21]: arr = pc.assume_timezone(pa.array([pd.Timestamp("2015-03-29 02:30:00")]), "Europe/Brussels", nonexistent="latest")
       ...: pc.round_temporal(arr, 16, "minute")
   Out[21]: 
   <pyarrow.lib.TimestampArray object at 0x7f62c8fc7e80>
   [
     2015-03-29 01:08:00.000000
   ]
   ```
   
   I am not fully sure that is correct. Previously it rounded in naive time and gave "2015-03-29 02:56:00", which doesn't exist. But I don't see how we would get to the result above (which is "2015-03-29 03:08:00" in local naive time).
   




[GitHub] [arrow] pitrou commented on a diff in pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
pitrou commented on code in PR #12528:
URL: https://github.com/apache/arrow/pull/12528#discussion_r871496376


##########
cpp/src/arrow/compute/kernels/temporal_internal.h:
##########
@@ -119,19 +178,109 @@ struct ZonedLocalizer {
     return tz->to_local(sys_time<Duration>(Duration{t}));
   }
 
+  template <typename Duration, typename Unit>
+  Duration FloorTimePoint(const int64_t t, const RoundTemporalOptions& options) const {
+    const local_time<Duration> lt = tz->to_local(sys_time<Duration>(Duration{t}));

Review Comment:
   Since we are flooring, would it be correct to always adopt the `AMBIGUOUS_EARLIEST` strategy?
   (similar question with ceiling)





[GitHub] [arrow] rok commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1125149935

   > @rok The tests don't seem to be exercising the case of rounding by an interval of more than one hour (for example 128 minutes), in which case the input timestamp might be in DST but not the rounded timestamp, or vice-versa...
   
   Got it. I'll take a look.




[GitHub] [arrow] pitrou commented on a diff in pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
pitrou commented on code in PR #12528:
URL: https://github.com/apache/arrow/pull/12528#discussion_r873700820


##########
cpp/src/arrow/compute/kernels/temporal_internal.h:
##########
@@ -119,19 +178,109 @@ struct ZonedLocalizer {
     return tz->to_local(sys_time<Duration>(Duration{t}));
   }
 
+  template <typename Duration, typename Unit>
+  Duration FloorTimePoint(const int64_t t, const RoundTemporalOptions& options) const {
+    const local_time<Duration> lt = tz->to_local(sys_time<Duration>(Duration{t}));

Review Comment:
   Sure, I mean convert with `earliest`, then floor to the configured multiple of time unit.
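
One possible reading of this suggestion, sketched in Python rather than with the PR's C++ `ZonedLocalizer`: floor the wall time, then resolve a repeated wall time to its first occurrence. The helper names, the explicit offsets, and the order of the two steps are assumptions made for illustration; the transition is modelled on Europe/Brussels on 2018-10-28 with hand-written offsets instead of a time zone lookup.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def floor_naive(wall, multiple):
    """Floor a naive wall time to a multiple counted from 1970-01-01 00:00 local."""
    return EPOCH + (wall - EPOCH) // multiple * multiple


def to_utc_earliest(wall, fold_end, offset_before, offset_after):
    """Convert a wall time to UTC around a fall-back transition.

    Wall times before `fold_end` use the pre-transition offset, so a repeated
    wall time resolves to its first (earliest) occurrence.
    """
    return wall - (offset_before if wall < fold_end else offset_after)


# Wall times 02:00 to 03:00 occur twice: first at UTC+2, then at UTC+1.
fold_end = datetime(2018, 10, 28, 3, 0)
summer, winter = timedelta(hours=2), timedelta(hours=1)

floored = floor_naive(datetime(2018, 10, 28, 2, 40), timedelta(minutes=15))
print(floored)                                             # 2018-10-28 02:30:00
print(to_utc_earliest(floored, fold_end, summer, winter))  # 2018-10-28 00:30:00
```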





[GitHub] [arrow] rok commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1149397704

   @pitrou @jorisvandenbossche I'd like to propose behaviour as sketched below. More examples in [this gist](https://gist.github.com/rok/0662537e66684b8417443e38bfa3bce1).
   ![image](https://user-images.githubusercontent.com/54589/172521064-9990874f-3a4f-4fcb-808d-5b5c18cc9006.png)
   ![image](https://user-images.githubusercontent.com/54589/172521129-ee7f4c51-b21f-4f54-8a8a-f59d7ddc85e0.png)
   
   The logic used is as currently in this PR.
   The remaining task is writing Python logic to test against.
   




[GitHub] [arrow] ursabot commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
ursabot commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1112848848

   Benchmark runs are scheduled for baseline = 564eb97a0995f8d896af0ba67239fa010f301c9b and contender = ec8cbd2aed0161fb03f3c6d5d3c7493832efc0e7. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Skipped :warning: Only ['lang', 'name'] filters are supported on ec2-t3-xlarge-us-east-2] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/d14b907f4ce34c979d5c053492e77da5...76b60fcc1e4848b8ad351317e877a942/)
   [Skipped :warning: Only ['lang', 'name'] filters are supported on test-mac-arm] [test-mac-arm](https://conbench.ursa.dev/compare/runs/1b4ae261d51d4804858b3cd14f6c9068...31ffc20ffef0478cab1298bfc34646c5/)
   [Skipped :warning: Only ['lang', 'name'] filters are supported on ursa-i9-9960x] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/2c02e87fa4104c5dabbd271f90991da0...e4223c8c0e5a42cc944cd1af4f880308/)
   [Scheduled] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/2d0fa9d790474ab582300f6d373b80e4...f3085f82f4c749868d5f054f4214eaf5/)
   Buildkite builds:
   [Scheduled] [`ec8cbd2a` ursa-thinkcentre-m75q](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/652)
   [Finished] [`564eb97a` ec2-t3-xlarge-us-east-2](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/666)
   [Scheduled] [`564eb97a` test-mac-arm](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/658)
   [Scheduled] [`564eb97a` ursa-i9-9960x](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/603)
   [Scheduled] [`564eb97a` ursa-thinkcentre-m75q](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/651)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   




[GitHub] [arrow] rok commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1131567401

   > But also in general, the question is: _what is actually the expected / desired result in the first place?_ Because the first and third value end up as "01:52:00+01:00" and "01:56:00+01:00" after rounding. But those two values are only 4 min apart, while the rounding was by 16min. So you would expect that every possible value in the result is 16min apart from each other.
   > 
   > I think this comes back to: do we want the result to be rounded in "physical time" (so the results are actually 16 min apart), or in "wall clock time"? Generally we round in local wall clock time, but when there are DST jumps, that will then cause times that are not actually the rounding unit apart from each other (only in case the rounding unit is not an exact divisor of the DST jump, like the 16min here).
   
   Because of DST jumps, pretty calendar rounding in wall time is impossible without having two "correction intervals" a year. If a user wants physical-time rounding, they can still fall back to `round(utc_time + offset) - offset`.
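
As a rough sketch of that fallback, assuming timestamps are plain integer seconds since the Unix epoch and the offset is a single fixed value chosen by the user (the name `floor_with_offset` and the sample values are illustrative, not part of Arrow's API):

```python
from datetime import datetime, timezone


def floor_with_offset(utc_seconds, offset_seconds, multiple_seconds):
    """Floor on a timeline shifted by a fixed offset, then shift back."""
    shifted = utc_seconds + offset_seconds
    return shifted // multiple_seconds * multiple_seconds - offset_seconds


# Illustrative input: floor to 16 minutes with a fixed 3 hour offset.
ts = int(datetime(2022, 9, 21, 18, 22, 10, tzinfo=timezone.utc).timestamp())
floored = floor_with_offset(ts, 3 * 3600, 16 * 60)
print(datetime.fromtimestamp(floored, tz=timezone.utc))  # 2022-09-21 18:20:00+00:00
```

Because the offset never changes, consecutive results are always a whole multiple apart in physical time, which wall-time rounding across a DST jump cannot guarantee.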
   
   I don't know much about user expectations beyond precedents by Pandas and lubridate, but we should probably have idempotency, maintain sortedness of "timestamp continuum" in UTC and maintain sortedness in local time. That might constrain the problem enough to only have one solution? :)
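
The criteria named here can be written down as small property checks. The sketch below uses integer minutes and two toy rounding functions; all names are hypothetical, and `shifted_floor` only mimics a rule that lands a few minutes before the grid point.

```python
def is_idempotent(round_fn, values):
    """Rounding an already rounded value must not change it."""
    return all(round_fn(round_fn(v)) == round_fn(v) for v in values)


def preserves_order(round_fn, values):
    """Rounding sorted input must give non-decreasing output."""
    rounded = [round_fn(v) for v in sorted(values)]
    return all(a <= b for a, b in zip(rounded, rounded[1:]))


def floor16(minutes):
    """Plain floor to a multiple of 16 minutes."""
    return minutes // 16 * 16


def shifted_floor(minutes):
    """Toy rule that floors and then lands 4 minutes earlier."""
    return minutes // 16 * 16 - 4


values = [110, 119, 180, 190]
print(is_idempotent(floor16, values), preserves_order(floor16, values))  # True True
print(is_idempotent(shifted_floor, values))                              # False
```

Running the same checks once on UTC values and once on local wall-clock values would cover both sortedness requirements.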
   
   Also, inside a DST jump we can choose to align either the start or the end of the jump with the rounding outside of the jump. Right now we (I think) make this choice based on whether we're doing a ceil or a floor. My worry here is that this might not preserve the sortedness (e.g. flooring to 16 minutes will in some cases floor to 4 minutes before the start of a DST jump).




[GitHub] [arrow] rok commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1140040492

   I've added Python tests to test the C++ implementation. Here are some visualisations of current behaviour:
   ![image](https://user-images.githubusercontent.com/54589/170787653-c8258af5-9e46-496f-8f1a-001161908391.png)
   ![image](https://user-images.githubusercontent.com/54589/170787676-757efed9-9852-45ea-bd33-7da30f153aac.png)
   
   And here's the Jupyter notebook used to generate visualisations: https://gist.github.com/rok/95140c58b4e516508866cf1367071e81
   
   @jorisvandenbossche adding `roll-forward/backward` [like clock](https://clock.r-lib.org/reference/posixt-rounding.html) might be a good idea. I don't like `shift-forward/backward` so much. We previously dismissed having multiple settings, though. However, it might be worth reopening that to enable both sortedness-maintaining and sortedness-destroying behaviour. Possible options:
   * Real local rounding (sortedness-destroying), currently implemented
   * roll-forward/backward (sortedness-maintaining)
   * NA
   * Throw error
   
   Please also note current behaviour has some odd jitter due to rounding from different offsets, see around 1am below:
   ![image](https://user-images.githubusercontent.com/54589/170789181-0434c5e7-8353-457a-b9fa-811f262d2461.png)
   I'm not sure there is an obvious way to address it.




[GitHub] [arrow] pitrou commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
pitrou commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1124922624

   @rok Can you take a look at the Python CI failures? They look related.




[GitHub] [arrow] rok commented on a diff in pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on code in PR #12528:
URL: https://github.com/apache/arrow/pull/12528#discussion_r871534682


##########
cpp/src/arrow/compute/kernels/scalar_temporal_test.cc:
##########
@@ -2611,6 +2611,260 @@ TEST_F(ScalarTemporalTest, TestRoundTemporal) {
   CheckScalarUnary(op, unit, times, unit, round_15_years, &round_to_15_years);
 }
 
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalAmbiguous1) {
+  // Asia/Tehran switches from UTC+4:30 to UTC+3:30 on 2022-09-22 00:00:00 UTC+4:30.
+  // This causes an hour long ambiguous period in local time.
+  auto unit = timestamp(TimeUnit::MILLI, "Asia/Tehran");
+  auto options = RoundTemporalOptions(25, CalendarUnit::MINUTE);
+  const char* times = R"([
+    "2022-09-21 18:09:00", "2022-09-21 18:10:00", "2022-09-21 18:11:00",
+    "2022-09-21 18:19:00", "2022-09-21 18:20:00", "2022-09-21 18:21:00",
+    "2022-09-21 18:44:00", "2022-09-21 18:45:00", "2022-09-21 18:46:00",
+    "2022-09-21 19:09:00", "2022-09-21 19:10:00", "2022-09-21 19:11:00",
+    "2022-09-21 19:24:00", "2022-09-21 19:25:00", "2022-09-21 19:26:00",
+    "2022-09-21 19:34:00", "2022-09-21 19:35:00", "2022-09-21 19:36:00",
+    "2022-09-21 19:59:00", "2022-09-21 20:00:00", "2022-09-21 20:01:00",
+    "2022-09-21 20:24:00", "2022-09-21 20:25:00", "2022-09-21 20:26:00",
+    "2022-09-21 20:49:00", "2022-09-21 20:50:00", "2022-09-21 20:51:00"])";
+  const char* times_ceil = R"([
+    "2022-09-21 18:10:00", "2022-09-21 18:10:00", "2022-09-21 18:20:00",

Review Comment:
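   We're rounding relative to an origin of `1970-01-01 00:00:00` in local time.
   ```
   import pandas as pd

   m = 1000000000  # nanoseconds per second
   # Seconds since the Unix epoch for the example timestamp.
   epoch_time = pd.to_datetime(["2022-09-21 18:22:10"], utc=True).astype(int) / m
   offset = 3600 * 3  # illustrative UTC offset in seconds
   # Floor to 16 minutes directly on the UTC timeline.
   t = epoch_time / 60 // 16 * 16 * 60 * m
   # Shift by the offset, floor to 16 minutes, then shift back.
   t2 = ((epoch_time + offset) / 60 // 16 * 16 * 60 - offset) * m
   print(pd.to_datetime(t2.astype(int)))
   print(pd.to_datetime(t.astype(int)))
   ```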
   Will return:
   ```
   DatetimeIndex(['2022-09-21 18:20:00'], dtype='datetime64[ns]', freq=None)
   DatetimeIndex(['2022-09-21 18:08:00'], dtype='datetime64[ns]', freq=None)
   ```





[GitHub] [arrow] jorisvandenbossche commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1130093279

   Personally, it would also help me to have a few hardcoded tests covering specific cases, with comments explaining what the result should be and why (like "this time is ... in local time, rounding this gives ..., this is ambiguous/nonexistent, so taking ... instead"), for a few example cases.
   For example, you could take the ones that Antoine was commenting on, from the example case that we used above in the discussion (https://github.com/apache/arrow/pull/12528#issuecomment-1124954128).




[GitHub] [arrow] jorisvandenbossche commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1131644858

   > I really dislike that sortedness of list can get destroyed in calendar-based-origin mode as shown by https://github.com/apache/arrow/pull/12528#issuecomment-1131369183.
   
   Yes, indeed, that's also what I found "off" about it .. But I suppose that is a caveat of this kind of rounding? (and a good reason we don't do that by default) 
   (and it also doesn't have the idempotency criterion ..)
   
   > I don't know much about user expectations beyond precedents by Pandas and lubridate, but we should probably have idempotency, maintain sortedness of "timestamp continuum" in UTC and maintain sortedness in local time. That might constrain the problem enough to only have one solution? :)
   
   That sounds like a good starting point.
   
   Considering those 4 example times around a DST jump again: 
   
   ```
   "01:50:00+01:00", 
   "01:59:59+01:00", 
   "03:00:00+02:00", 
   "03:10:00+02:00"
   ```
   
   I can currently think of two options. 1) A value that gets rounded "into the jump", i.e. to a nonexistent time, gets moved to the border of the jump. Whether that is the start or the end of the jump doesn't really matter, as both are the same point in time; in practice it is represented as the end. I also wouldn't do anything different for floor vs ceil. In that case we get something like:
   
   ```
   data -> rounded in local naive time -> with timezone
   "01:50:00+01:00" -> "01:52:00" -> "01:52:00+01:00"
   "01:59:59+01:00" -> "01:52:00" -> "01:52:00+01:00"
   "03:00:00+02:00" -> "02:56:00" -> "03:00:00+02:00"
   "03:10:00+02:00" -> "03:12:00" -> "03:12:00+02:00"
   ```
   
   Or otherwise, 2) a value that gets rounded into a jump, i.e. to a nonexistent time, gets moved to the "closest" rounded value (that would otherwise occur) outside of the jump:
   
   ```
   data -> rounded in local naive time -> with timezone
   "01:50:00+01:00" -> "01:52:00" -> "01:52:00+01:00"
   "01:59:59+01:00" -> "01:52:00" -> "01:52:00+01:00"
   "03:00:00+02:00" -> "02:56:00" -> "03:12:00+02:00"  <-- only this one is different
   "03:10:00+02:00" -> "03:12:00" -> "03:12:00+02:00"
   ```
   
   In this case floor vs ceil could use the rounded value before vs after.
   
   Both cases preserve the sortedness, and are idempotent. In this example at least; I don't know if we could come up with an example where the value at the jump (in this case "03:00:00") would not round to itself. Probably this is possible by playing with the exact multiple, in which case this is a reason to maybe go for option 2.
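
The two options can be compared with a short Python sketch that reproduces the tables above in naive wall time. Everything here is illustrative: the function names are made up, the gap bounds are passed in by hand, and "closest" in option 2 is interpreted as closest in wall time to the rounded value, which is an assumption.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def round_naive(wall, multiple):
    """Round a naive wall time to the nearest multiple counted from the epoch."""
    return EPOCH + ((wall - EPOCH) + multiple / 2) // multiple * multiple


def resolve_option1(rounded, gap_start, gap_end):
    """Option 1: a result inside the gap moves to the end of the gap."""
    return gap_end if gap_start <= rounded < gap_end else rounded


def resolve_option2(rounded, multiple, gap_start, gap_end):
    """Option 2: a result inside the gap moves to the nearest grid point outside it."""
    if not (gap_start <= rounded < gap_end):
        return rounded
    before, after = rounded, rounded
    while before >= gap_start:
        before -= multiple
    while after < gap_end:
        after += multiple
    return before if rounded - before < after - rounded else after


gap_start, gap_end = datetime(2015, 3, 29, 2, 0), datetime(2015, 3, 29, 3, 0)
step = timedelta(minutes=16)
walls = [datetime(2015, 3, 29, 1, 50), datetime(2015, 3, 29, 1, 59, 59),
         datetime(2015, 3, 29, 3, 0), datetime(2015, 3, 29, 3, 10)]

for wall in walls:
    rounded = round_naive(wall, step)
    print(wall.time(), rounded.time(),
          resolve_option1(rounded, gap_start, gap_end).time(),
          resolve_option2(rounded, step, gap_start, gap_end).time())
```

Only the third input differs between the two columns: option 1 yields 03:00:00 and option 2 yields 03:12:00.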




[GitHub] [arrow] rok commented on a diff in pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on code in PR #12528:
URL: https://github.com/apache/arrow/pull/12528#discussion_r872912757


##########
cpp/src/arrow/compute/kernels/scalar_temporal_test.cc:
##########
@@ -2611,6 +2611,260 @@ TEST_F(ScalarTemporalTest, TestRoundTemporal) {
   CheckScalarUnary(op, unit, times, unit, round_15_years, &round_to_15_years);
 }
 
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalAmbiguous1) {
+  // Asia/Tehran switches from UTC+4:30 to UTC+3:30 on 2022-09-22 00:00:00 UTC+4:30.
+  // This causes an hour long ambiguous period in local time.
+  auto unit = timestamp(TimeUnit::MILLI, "Asia/Tehran");
+  auto options = RoundTemporalOptions(25, CalendarUnit::MINUTE);
+  const char* times = R"([
+    "2022-09-21 18:09:00", "2022-09-21 18:10:00", "2022-09-21 18:11:00",
+    "2022-09-21 18:19:00", "2022-09-21 18:20:00", "2022-09-21 18:21:00",
+    "2022-09-21 18:44:00", "2022-09-21 18:45:00", "2022-09-21 18:46:00",
+    "2022-09-21 19:09:00", "2022-09-21 19:10:00", "2022-09-21 19:11:00",
+    "2022-09-21 19:24:00", "2022-09-21 19:25:00", "2022-09-21 19:26:00",
+    "2022-09-21 19:34:00", "2022-09-21 19:35:00", "2022-09-21 19:36:00",
+    "2022-09-21 19:59:00", "2022-09-21 20:00:00", "2022-09-21 20:01:00",
+    "2022-09-21 20:24:00", "2022-09-21 20:25:00", "2022-09-21 20:26:00",
+    "2022-09-21 20:49:00", "2022-09-21 20:50:00", "2022-09-21 20:51:00"])";
+  const char* times_ceil = R"([
+    "2022-09-21 18:10:00", "2022-09-21 18:10:00", "2022-09-21 18:20:00",
+    "2022-09-21 18:20:00", "2022-09-21 18:20:00", "2022-09-21 18:45:00",
+    "2022-09-21 18:45:00", "2022-09-21 18:45:00", "2022-09-21 19:10:00",
+    "2022-09-21 19:10:00", "2022-09-21 19:10:00", "2022-09-21 19:35:00",
+    "2022-09-21 19:35:00", "2022-09-21 19:35:00", "2022-09-21 19:35:00",
+    "2022-09-21 19:35:00", "2022-09-21 19:35:00", "2022-09-21 20:00:00",
+    "2022-09-21 20:00:00", "2022-09-21 20:00:00", "2022-09-21 20:25:00",
+    "2022-09-21 20:25:00", "2022-09-21 20:25:00", "2022-09-21 20:50:00",
+    "2022-09-21 20:50:00", "2022-09-21 20:50:00", "2022-09-21 21:15:00"])";
+  const char* times_floor = R"([
+    "2022-09-21 17:45:00", "2022-09-21 18:10:00", "2022-09-21 18:10:00",
+    "2022-09-21 18:10:00", "2022-09-21 18:10:00", "2022-09-21 18:10:00",
+    "2022-09-21 18:20:00", "2022-09-21 18:45:00", "2022-09-21 18:45:00",
+    "2022-09-21 18:45:00", "2022-09-21 19:10:00", "2022-09-21 19:10:00",
+    "2022-09-21 19:10:00", "2022-09-21 19:10:00", "2022-09-21 19:10:00",
+    "2022-09-21 19:10:00", "2022-09-21 19:35:00", "2022-09-21 19:35:00",
+    "2022-09-21 19:35:00", "2022-09-21 20:00:00", "2022-09-21 20:00:00",
+    "2022-09-21 20:00:00", "2022-09-21 20:25:00", "2022-09-21 20:25:00",
+    "2022-09-21 20:25:00", "2022-09-21 20:50:00", "2022-09-21 20:50:00"])";
+  const char* times_round = R"([
+    "2022-09-21 18:10:00", "2022-09-21 18:10:00", "2022-09-21 18:10:00",
+    "2022-09-21 18:10:00", "2022-09-21 18:10:00", "2022-09-21 18:10:00",
+    "2022-09-21 18:45:00", "2022-09-21 18:45:00", "2022-09-21 18:45:00",
+    "2022-09-21 19:10:00", "2022-09-21 19:10:00", "2022-09-21 19:10:00",
+    "2022-09-21 19:35:00", "2022-09-21 19:35:00", "2022-09-21 19:35:00",
+    "2022-09-21 19:35:00", "2022-09-21 19:35:00", "2022-09-21 19:35:00",
+    "2022-09-21 20:00:00", "2022-09-21 20:00:00", "2022-09-21 20:00:00",
+    "2022-09-21 20:25:00", "2022-09-21 20:25:00", "2022-09-21 20:25:00",
+    "2022-09-21 20:50:00", "2022-09-21 20:50:00", "2022-09-21 20:50:00"])";
+
+  CheckScalarUnary("ceil_temporal", unit, times, unit, times_ceil, &options);
+  CheckScalarUnary("floor_temporal", unit, times, unit, times_floor, &options);
+  CheckScalarUnary("round_temporal", unit, times, unit, times_round, &options);
+}
+
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalAmbiguous2) {
+  // Europe/Brussels switches from UTC+2:00 to UTC+1:00 on 2018-10-28 03:00:00 UTC+2:00
+  // This causes an hour long ambiguous period in local time.
+  auto unit = timestamp(TimeUnit::NANO, "Europe/Brussels");
+  auto options = RoundTemporalOptions(25, CalendarUnit::MINUTE);
+  const char* complete_times = R"([
+    "2018-10-27 23:05:00", "2018-10-27 23:06:00", "2018-10-27 23:07:00", "2018-10-27 23:08:00",
+    "2018-10-27 23:09:00", "2018-10-27 23:10:00", "2018-10-27 23:11:00", "2018-10-27 23:12:00",
+    "2018-10-27 23:13:00", "2018-10-27 23:14:00", "2018-10-27 23:15:00", "2018-10-27 23:16:00",
+    "2018-10-27 23:17:00", "2018-10-27 23:18:00", "2018-10-27 23:19:00", "2018-10-27 23:20:00",
+    "2018-10-27 23:21:00", "2018-10-27 23:22:00", "2018-10-27 23:23:00", "2018-10-27 23:24:00",
+    "2018-10-27 23:25:00", "2018-10-27 23:26:00", "2018-10-27 23:27:00", "2018-10-27 23:28:00",
+    "2018-10-27 23:29:00", "2018-10-27 23:30:00", "2018-10-27 23:31:00", "2018-10-27 23:32:00",
+    "2018-10-27 23:33:00", "2018-10-27 23:34:00", "2018-10-27 23:35:00", "2018-10-27 23:36:00",
+    "2018-10-27 23:37:00", "2018-10-27 23:38:00", "2018-10-27 23:39:00", "2018-10-27 23:40:00",
+    "2018-10-27 23:41:00", "2018-10-27 23:42:00", "2018-10-27 23:43:00", "2018-10-27 23:44:00",
+    "2018-10-27 23:45:00", "2018-10-27 23:46:00", "2018-10-27 23:47:00", "2018-10-27 23:48:00",
+    "2018-10-27 23:49:00", "2018-10-27 23:50:00", "2018-10-27 23:51:00", "2018-10-27 23:52:00",
+    "2018-10-27 23:53:00", "2018-10-27 23:54:00", "2018-10-27 23:55:00", "2018-10-27 23:56:00",
+    "2018-10-27 23:57:00", "2018-10-27 23:58:00", "2018-10-27 23:59:00", "2018-10-28 00:00:00",
+    "2018-10-28 00:01:00", "2018-10-28 00:02:00", "2018-10-28 00:03:00", "2018-10-28 00:04:00",
+    "2018-10-28 00:05:00", "2018-10-28 00:06:00", "2018-10-28 00:07:00", "2018-10-28 00:08:00",
+    "2018-10-28 00:09:00", "2018-10-28 00:10:00", "2018-10-28 00:11:00", "2018-10-28 00:12:00",
+    "2018-10-28 00:13:00", "2018-10-28 00:14:00", "2018-10-28 00:15:00", "2018-10-28 00:16:00",
+    "2018-10-28 00:17:00", "2018-10-28 00:18:00", "2018-10-28 00:19:00", "2018-10-28 00:20:00",
+    "2018-10-28 00:21:00", "2018-10-28 00:22:00", "2018-10-28 00:23:00", "2018-10-28 00:24:00",
+    "2018-10-28 00:25:00", "2018-10-28 00:26:00", "2018-10-28 00:27:00", "2018-10-28 00:28:00",
+    "2018-10-28 00:29:00", "2018-10-28 00:30:00", "2018-10-28 00:31:00", "2018-10-28 00:32:00",
+    "2018-10-28 00:33:00", "2018-10-28 00:34:00", "2018-10-28 00:35:00", "2018-10-28 00:36:00",
+    "2018-10-28 00:37:00", "2018-10-28 00:38:00", "2018-10-28 00:39:00", "2018-10-28 00:40:00",
+    "2018-10-28 00:41:00", "2018-10-28 00:42:00", "2018-10-28 00:43:00", "2018-10-28 00:44:00",
+    "2018-10-28 00:45:00", "2018-10-28 00:46:00", "2018-10-28 00:47:00", "2018-10-28 00:48:00",
+    "2018-10-28 00:49:00", "2018-10-28 00:50:00", "2018-10-28 00:51:00", "2018-10-28 00:52:00",
+    "2018-10-28 00:53:00", "2018-10-28 00:54:00", "2018-10-28 00:55:00", "2018-10-28 00:56:00",
+    "2018-10-28 00:57:00", "2018-10-28 00:58:00", "2018-10-28 00:59:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:01:00", "2018-10-28 01:02:00", "2018-10-28 01:03:00", "2018-10-28 01:04:00",
+    "2018-10-28 01:05:00", "2018-10-28 01:06:00", "2018-10-28 01:07:00", "2018-10-28 01:08:00",
+    "2018-10-28 01:09:00", "2018-10-28 01:10:00", "2018-10-28 01:11:00", "2018-10-28 01:12:00",
+    "2018-10-28 01:13:00", "2018-10-28 01:14:00", "2018-10-28 01:15:00", "2018-10-28 01:16:00",
+    "2018-10-28 01:17:00", "2018-10-28 01:18:00", "2018-10-28 01:19:00", "2018-10-28 01:20:00",
+    "2018-10-28 01:21:00", "2018-10-28 01:22:00", "2018-10-28 01:23:00", "2018-10-28 01:24:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:26:00", "2018-10-28 01:27:00", "2018-10-28 01:28:00",
+    "2018-10-28 01:29:00", "2018-10-28 01:30:00", "2018-10-28 01:31:00", "2018-10-28 01:32:00",
+    "2018-10-28 01:33:00", "2018-10-28 01:34:00", "2018-10-28 01:35:00", "2018-10-28 01:36:00",
+    "2018-10-28 01:37:00", "2018-10-28 01:38:00", "2018-10-28 01:39:00", "2018-10-28 01:40:00",
+    "2018-10-28 01:41:00", "2018-10-28 01:42:00", "2018-10-28 01:43:00", "2018-10-28 01:44:00",
+    "2018-10-28 01:45:00", "2018-10-28 01:46:00", "2018-10-28 01:47:00", "2018-10-28 01:48:00",
+    "2018-10-28 01:49:00", "2018-10-28 01:50:00", "2018-10-28 01:51:00", "2018-10-28 01:52:00",
+    "2018-10-28 01:53:00", "2018-10-28 01:54:00", "2018-10-28 01:55:00", "2018-10-28 01:56:00",
+    "2018-10-28 01:57:00", "2018-10-28 01:58:00", "2018-10-28 01:59:00", "2018-10-28 02:00:00",
+    "2018-10-28 02:01:00", "2018-10-28 02:02:00", "2018-10-28 02:03:00", "2018-10-28 02:04:00",
+    "2018-10-28 02:05:00", "2018-10-28 02:06:00", "2018-10-28 02:07:00", "2018-10-28 02:08:00",
+    "2018-10-28 02:09:00", "2018-10-28 02:10:00", "2018-10-28 02:11:00", "2018-10-28 02:12:00",
+    "2018-10-28 02:13:00", "2018-10-28 02:14:00", "2018-10-28 02:15:00", "2018-10-28 02:16:00"])";
+
+  const char* complete_times_floor = R"([
+    "2018-10-27 22:45:00", "2018-10-27 22:45:00", "2018-10-27 22:45:00", "2018-10-27 22:45:00",
+    "2018-10-27 22:45:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:45:00",
+    "2018-10-27 23:45:00", "2018-10-27 23:45:00", "2018-10-27 23:45:00", "2018-10-27 23:45:00",
+    "2018-10-27 23:45:00", "2018-10-27 23:45:00", "2018-10-27 23:45:00", "2018-10-27 23:45:00",
+    "2018-10-27 23:45:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00",
+    "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00",
+    "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00",
+    "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00",
+    "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00",
+    "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00",
+    "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00",
+    "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00",
+    "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00",
+    "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00",
+    "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00",
+    "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00",
+    "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 02:15:00", "2018-10-28 02:15:00"])";
+
+  const char* times = R"([
+    "2018-10-27 22:44:00", "2018-10-27 22:45:00", "2018-10-27 22:46:00",
+    "2018-10-27 23:09:00", "2018-10-27 23:10:00", "2018-10-27 23:11:00",
+    "2018-10-27 23:34:00", "2018-10-27 23:35:00", "2018-10-27 23:36:00",
+    "2018-10-27 23:44:00", "2018-10-27 23:45:00", "2018-10-27 23:46:00",
+    "2018-10-27 23:46:00", "2018-10-28 00:00:00", "2018-10-28 00:09:00",
+    "2018-10-28 00:09:00", "2018-10-28 00:10:00", "2018-10-28 00:11:00",
+    "2018-10-28 00:34:00", "2018-10-28 00:35:00", "2018-10-28 00:36:00",
+    "2018-10-28 00:59:00", "2018-10-28 01:00:00", "2018-10-28 01:01:00",
+    "2018-10-28 01:24:00", "2018-10-28 01:25:00", "2018-10-28 01:26:00",
+    "2018-10-28 01:49:00", "2018-10-28 01:50:00", "2018-10-28 01:51:00",
+    "2018-10-28 02:14:00", "2018-10-28 02:15:00", "2018-10-28 02:16:00"])";
+  const char* times_ceil = R"([
+    "2018-10-27 22:45:00", "2018-10-27 22:45:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:45:00",
+    "2018-10-27 23:45:00", "2018-10-27 23:45:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:35:00",
+    "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:50:00",
+    "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 02:15:00",
+    "2018-10-28 02:15:00", "2018-10-28 02:15:00", "2018-10-28 02:40:00"])";
+  const char* times_floor = R"([
+    "2018-10-27 22:20:00", "2018-10-27 22:45:00", "2018-10-27 22:45:00",
+    "2018-10-27 22:45:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:45:00", "2018-10-27 23:45:00",
+    "2018-10-27 23:45:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00",
+    "2018-10-28 00:35:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00",
+    "2018-10-28 01:50:00", "2018-10-28 02:15:00", "2018-10-28 02:15:00"])";
+  const char* times_round = R"([
+    "2018-10-27 22:45:00", "2018-10-27 22:45:00", "2018-10-27 22:45:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00",
+    "2018-10-28 02:15:00", "2018-10-28 02:15:00", "2018-10-28 02:15:00"])";
+
+  CheckScalarUnary("floor_temporal", unit, complete_times, unit, complete_times_floor,
+                   &options);
+  CheckScalarUnary("ceil_temporal", unit, times, unit, times_ceil, &options);
+  CheckScalarUnary("floor_temporal", unit, times, unit, times_floor, &options);
+  CheckScalarUnary("round_temporal", unit, times, unit, times_round, &options);
+}
+
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalNonexistent1) {
+  // Asia/Tehran switches from UTC+3:30 to UTC+4:30 on 2022-03-22 00:00:00 UTC+3:30
+  // This causes an hour long non-existing period in local time.
+  auto unit = timestamp(TimeUnit::SECOND, "Asia/Tehran");
+  auto options = RoundTemporalOptions(16, CalendarUnit::MINUTE);
+  const char* times = R"([
+    "2022-03-21 19:30:00", "2022-03-21 20:00:00", "2022-03-21 20:30:00"])";
+  const char* times_ceil = R"([
+    "2022-03-21 19:42:00", "2022-03-21 20:14:00", "2022-03-21 20:34:00"])";
+  const char* times_floor = R"([
+    "2022-03-21 19:26:00", "2022-03-21 19:58:00", "2022-03-21 20:28:00"])";

Review Comment:
   Fixed using [Joris' suggestion](https://github.com/apache/arrow/pull/12528#issuecomment-1125068996). [Diff here.](https://github.com/apache/arrow/pull/12528/commits/39b108c604aac1785156cfb2947a70b549e109b5)





[GitHub] [arrow] rok commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1129523248

   @pitrou I've [added Python tests](https://github.com/apache/arrow/pull/12528/commits/b60c1ce0c0f5e9619fb3cc1bc4918404d4b5f6fb), and floor and ceil seem OK. Round still needs to be fixed in C++.




[GitHub] [arrow] rok commented on a diff in pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on code in PR #12528:
URL: https://github.com/apache/arrow/pull/12528#discussion_r865042558


##########
cpp/src/arrow/compute/kernels/scalar_temporal_test.cc:
##########
@@ -2611,6 +2611,107 @@ TEST_F(ScalarTemporalTest, TestRoundTemporal) {
   CheckScalarUnary(op, unit, times, unit, round_15_years, &round_to_15_years);
 }
 
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalAmbiguous1) {
+  auto unit = timestamp(TimeUnit::MILLI, "Asia/Tehran");
+  auto options = RoundTemporalOptions(1, CalendarUnit::HOUR, /*some_parameter=*/true);
+  // Asia/Tehran switched from UTC+X to UTC+Y on 2022-03-31 HH:mm:ss

Review Comment:
   So far I have something like this:
   ```
     // Asia/Tehran switches from UTC+4:30 to UTC+3:30 on 2022-09-22 00:00:00 UTC+4:30.
     // This causes an hour long ambiguous period in local time.
   
     // Asia/Tehran switches from UTC+3:30 to UTC+4:30 on 2022-03-22 00:00:00 UTC+3:30
     // This causes an hour long non-existing period in local time.
   ```





[GitHub] [arrow] rok commented on a diff in pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on code in PR #12528:
URL: https://github.com/apache/arrow/pull/12528#discussion_r865137124


##########
cpp/src/arrow/compute/kernels/temporal_internal.h:
##########
@@ -119,19 +152,93 @@ struct ZonedLocalizer {
     return tz->to_local(sys_time<Duration>(Duration{t}));
   }
 
-  template <typename Duration>
-  Duration ConvertLocalToSys(Duration t, Status* st) const {
+  template <typename Duration, typename Unit>
+  Duration FloorTimePoint(const int64_t t, const RoundTemporalOptions* options) const {
+    local_time<Duration> lt = tz->to_local(sys_time<Duration>(Duration{t}));
+    const Unit d = floor<Unit>(lt).time_since_epoch();
+    Unit d2;
+
+    if (options->multiple == 1) {
+      d2 = d;
+    } else {
+      const Unit unit = Unit{options->multiple};
+      d2 = (d.count() >= 0) ? d / unit * unit : (d - unit + Unit{1}) / unit * unit;
+    }
+
     try {
-      return zoned_time<Duration>{tz, local_time<Duration>(t)}
+      return zoned_time<Duration>(tz, local_time<Duration>(duration_cast<Duration>(d2)))
           .get_sys_time()
           .time_since_epoch();
-    } catch (const arrow_vendored::date::nonexistent_local_time& e) {
-      *st = Status::Invalid("Local time does not exist: ", e.what());
-      return Duration{0};
-    } catch (const arrow_vendored::date::ambiguous_local_time& e) {
-      *st = Status::Invalid("Local time is ambiguous: ", e.what());
-      return Duration{0};
+    } catch (const arrow_vendored::date::ambiguous_local_time&) {
+      // In case we hit an ambiguous period we round to a time multiple just prior,
+      // convert to UTC and add the time unit we're rounding to.
+      const arrow_vendored::date::local_info li =
+          tz->get_info(local_time<Duration>(duration_cast<Duration>(d2)));
+
+      const Duration t2 =
+          zoned_time<Duration>(
+              tz, local_time<Duration>(duration_cast<Duration>(d2 - li.second.offset)))
+              .get_sys_time()
+              .time_since_epoch();
+      const Unit unit = Unit{options->multiple};
+      const Unit t3 =
+          (t2.count() >= 0) ? t2 / unit * unit : (t2 - unit + Unit{1}) / unit * unit;
+      const Duration t4 = duration_cast<Duration>(t3 + li.first.offset);
+      if (t < t4.count()) {
+        return duration_cast<Duration>(t3 + li.second.offset);
+      }
+      return duration_cast<Duration>(t4);
+    } catch (const arrow_vendored::date::nonexistent_local_time&) {
+      // In case we hit a nonexistent period we calculate the duration between the
+      // start of nonexistent period and rounded to moment in UTC (nonexistent_offset).
+      // We then floor the beginning of the nonexisting period in local time and add
+      // nonexistent_offset to that time point in UTC.
+      const arrow_vendored::date::local_info li =
+          tz->get_info(local_time<Duration>(duration_cast<Duration>(d2)));
+
+      const Duration t2 =
+          zoned_time<Duration>(tz, local_time<Duration>(duration_cast<Duration>(d2)),
+                               arrow_vendored::date::choose::earliest)
+              .get_sys_time()
+              .time_since_epoch();
+      const Duration nonexistent_offset = duration_cast<Duration>(t2 - d2);
+
+      const Unit unit = Unit{options->multiple};
+      const Unit t3 =
+          (t2.count() >= 0) ? t2 / unit * unit : (t2 - unit + Unit{1}) / unit * unit;
+      return duration_cast<Duration>(t3 + li.second.offset) + nonexistent_offset;
+    }
+  }
+
+  template <typename Duration, typename Unit>
+  Duration CeilTimePoint(const int64_t t, const RoundTemporalOptions* options) const {
+    const Duration d = FloorTimePoint<Duration, Unit>(t, options);
+    if (d.count() == t) {
+      return d;
+    }
+    return FloorTimePoint<Duration, Unit>(
+        t + duration_cast<Duration>(Unit{options->multiple}).count(), options);

Review Comment:
   There is definitely a more efficient way; however, I think it fits more logically into the scope of https://github.com/apache/arrow/pull/13043.
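
As a rough illustration of the kind of simplification alluded to here, the sketch below floors a duration to a multiple of a unit and derives the ceiling from a single floor call rather than two. It deliberately ignores the time zone transitions that `FloorTimePoint` handles, so it is not a drop-in replacement. `FloorToMultiple` and `CeilToMultiple` are hypothetical names used only for this sketch; they are not part of Arrow.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical helper: floor `d` to a multiple of `multiple` units,
// rounding toward negative infinity.
template <typename Unit>
Unit FloorToMultiple(Unit d, std::int64_t multiple) {
  assert(multiple > 0);
  const Unit unit{multiple};
  // Integer division truncates toward zero, so negative values are shifted
  // down by (unit - 1) first, mirroring the expression used in the hunk.
  return (d.count() >= 0) ? d / unit * unit : (d - unit + Unit{1}) / unit * unit;
}

// Hypothetical helper: ceil `d` to a multiple of `multiple` units using one
// floor call. Values already on a multiple are returned unchanged.
template <typename Unit>
Unit CeilToMultiple(Unit d, std::int64_t multiple) {
  const Unit floored = FloorToMultiple(d, multiple);
  return (floored == d) ? floored : floored + Unit{multiple};
}
```

With these definitions, `CeilToMultiple(std::chrono::minutes{40}, 25)` gives 50 minutes and `FloorToMultiple(std::chrono::minutes{-1}, 25)` gives -25 minutes.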





[GitHub] [arrow] rok commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1117661130

   @pitrou if you agree with moving the efficiency work out of scope, this is ready for another review.




[GitHub] [arrow] amol- commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by "amol- (via GitHub)" <gi...@apache.org>.
amol- commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1490656393

   Closing because it has been untouched for a while. If it's still relevant, feel free to reopen and move it forward 👍




[GitHub] [arrow] pitrou commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
pitrou commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1131390558

   @jorisvandenbossche Isn't lubridate simply implementing "calendar-based origin" while Pandas isn't?
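
The two origins contrasted in this question can be sketched with plain minute arithmetic. The example below makes no claim about what either library actually does; it only shows how the choice of origin changes the result. `FloorFromEpoch` and `FloorFromMidnight` are hypothetical names, and local times are assumed to be non-negative minute counts since 1970-01-01 00:00 local.

```cpp
#include <cassert>
#include <chrono>

using std::chrono::minutes;

// Hypothetical helper: floor a local time to a multiple of `step`,
// counting multiples from the epoch.
minutes FloorFromEpoch(minutes local_since_epoch, minutes step) {
  assert(step.count() > 0 && local_since_epoch.count() >= 0);
  return local_since_epoch - local_since_epoch % step;
}

// Hypothetical helper: floor a local time to a multiple of `step`,
// counting multiples from local midnight of the same day.
minutes FloorFromMidnight(minutes local_since_epoch, minutes step) {
  assert(step.count() > 0 && local_since_epoch.count() >= 0);
  const minutes day{24 * 60};
  const minutes midnight = local_since_epoch - local_since_epoch % day;
  const minutes since_midnight = local_since_epoch - midnight;
  return midnight + (since_midnight - since_midnight % step);
}
```

Take 2018-10-28 01:05 Europe/Brussels local time (23:05 UTC on 2018-10-27, a value from the PR's test data), where 2018-10-28 is day 17832 after the epoch. With a 25 minute step, the epoch origin gives 00:45 local, which is 22:45 UTC and matches the expected floor in the test, while the midnight origin gives 00:50 local.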




[GitHub] [arrow] pitrou commented on a diff in pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
pitrou commented on code in PR #12528:
URL: https://github.com/apache/arrow/pull/12528#discussion_r871455170


##########
cpp/src/arrow/compute/kernels/scalar_temporal_test.cc:
##########
@@ -2611,6 +2611,260 @@ TEST_F(ScalarTemporalTest, TestRoundTemporal) {
   CheckScalarUnary(op, unit, times, unit, round_15_years, &round_to_15_years);
 }
 
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalAmbiguous1) {
+  // Asia/Tehran switches from UTC+4:30 to UTC+3:30 on 2022-09-22 00:00:00 UTC+4:30.
+  // This causes an hour long ambiguous period in local time.
+  auto unit = timestamp(TimeUnit::MILLI, "Asia/Tehran");
+  auto options = RoundTemporalOptions(25, CalendarUnit::MINUTE);
+  const char* times = R"([
+    "2022-09-21 18:09:00", "2022-09-21 18:10:00", "2022-09-21 18:11:00",
+    "2022-09-21 18:19:00", "2022-09-21 18:20:00", "2022-09-21 18:21:00",
+    "2022-09-21 18:44:00", "2022-09-21 18:45:00", "2022-09-21 18:46:00",
+    "2022-09-21 19:09:00", "2022-09-21 19:10:00", "2022-09-21 19:11:00",
+    "2022-09-21 19:24:00", "2022-09-21 19:25:00", "2022-09-21 19:26:00",
+    "2022-09-21 19:34:00", "2022-09-21 19:35:00", "2022-09-21 19:36:00",
+    "2022-09-21 19:59:00", "2022-09-21 20:00:00", "2022-09-21 20:01:00",
+    "2022-09-21 20:24:00", "2022-09-21 20:25:00", "2022-09-21 20:26:00",
+    "2022-09-21 20:49:00", "2022-09-21 20:50:00", "2022-09-21 20:51:00"])";
+  const char* times_ceil = R"([
+    "2022-09-21 18:10:00", "2022-09-21 18:10:00", "2022-09-21 18:20:00",

Review Comment:
   So, if I take `18:10`, it translates to `22:40` in local time, which is not a multiple of 25 minutes... am I misunderstanding something?
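
The arithmetic behind this question can be checked with a small sketch. It assumes a fixed UTC+4:30 offset for the timestamp and counts multiples from the Unix epoch in local time; both are assumptions of this example rather than statements about the kernel. `ToLocal` and `IsMultipleOf` are hypothetical helper names.

```cpp
#include <cassert>
#include <chrono>

using std::chrono::hours;
using std::chrono::minutes;

// Hypothetical helper: local wall-clock time as minutes since
// 1970-01-01 00:00 local, given a UTC time and a fixed UTC offset.
// No time zone database is involved.
minutes ToLocal(minutes utc_since_epoch, minutes utc_offset) {
  return utc_since_epoch + utc_offset;
}

// Hypothetical helper: true if `t` is a whole multiple of `step`.
bool IsMultipleOf(minutes t, minutes step) {
  assert(step.count() > 0);
  return (t % step).count() == 0;
}
```

2022-09-21 is day 19256 after the epoch, so 18:10 UTC becomes 22:40 local, or 27,730,000 minutes since the epoch. That count is divisible by 25, whereas the 1360 minutes since local midnight are not (1360 mod 25 = 10).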





[GitHub] [arrow] pitrou commented on a diff in pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
pitrou commented on code in PR #12528:
URL: https://github.com/apache/arrow/pull/12528#discussion_r873711335


##########
cpp/src/arrow/compute/kernels/scalar_temporal_test.cc:
##########
@@ -2611,6 +2611,309 @@ TEST_F(ScalarTemporalTest, TestRoundTemporal) {
   CheckScalarUnary(op, unit, times, unit, round_15_years, &round_to_15_years);
 }
 
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalAmbiguous1) {
+  // Asia/Tehran switches from UTC+4:30 to UTC+3:30 on 2022-09-22 00:00:00 UTC+4:30.
+  // This causes an hour long ambiguous period in local time.
+  auto unit = timestamp(TimeUnit::MILLI, "Asia/Tehran");
+  auto options = RoundTemporalOptions(25, CalendarUnit::MINUTE);
+  const char* times = R"([
+    "2022-09-21 18:09:00", "2022-09-21 18:10:00", "2022-09-21 18:11:00",
+    "2022-09-21 18:19:00", "2022-09-21 18:20:00", "2022-09-21 18:21:00",
+    "2022-09-21 18:44:00", "2022-09-21 18:45:00", "2022-09-21 18:46:00",
+    "2022-09-21 19:09:00", "2022-09-21 19:10:00", "2022-09-21 19:11:00",
+    "2022-09-21 19:24:00", "2022-09-21 19:25:00", "2022-09-21 19:26:00",
+    "2022-09-21 19:34:00", "2022-09-21 19:35:00", "2022-09-21 19:36:00",
+    "2022-09-21 19:59:00", "2022-09-21 20:00:00", "2022-09-21 20:01:00",
+    "2022-09-21 20:24:00", "2022-09-21 20:25:00", "2022-09-21 20:26:00",
+    "2022-09-21 20:49:00", "2022-09-21 20:50:00", "2022-09-21 20:51:00"])";
+  const char* times_ceil = R"([
+    "2022-09-21 18:10:00", "2022-09-21 18:10:00", "2022-09-21 18:20:00",
+    "2022-09-21 18:20:00", "2022-09-21 18:20:00", "2022-09-21 18:45:00",
+    "2022-09-21 18:45:00", "2022-09-21 18:45:00", "2022-09-21 19:10:00",
+    "2022-09-21 19:10:00", "2022-09-21 19:10:00", "2022-09-21 19:35:00",
+    "2022-09-21 19:35:00", "2022-09-21 19:35:00", "2022-09-21 19:35:00",
+    "2022-09-21 19:35:00", "2022-09-21 19:35:00", "2022-09-21 20:00:00",
+    "2022-09-21 20:00:00", "2022-09-21 20:00:00", "2022-09-21 20:25:00",
+    "2022-09-21 20:25:00", "2022-09-21 20:25:00", "2022-09-21 20:50:00",
+    "2022-09-21 20:50:00", "2022-09-21 20:50:00", "2022-09-21 21:15:00"])";
+  const char* times_floor = R"([
+    "2022-09-21 17:45:00", "2022-09-21 18:10:00", "2022-09-21 18:10:00",
+    "2022-09-21 18:10:00", "2022-09-21 18:10:00", "2022-09-21 18:10:00",
+    "2022-09-21 18:20:00", "2022-09-21 18:45:00", "2022-09-21 18:45:00",
+    "2022-09-21 18:45:00", "2022-09-21 19:10:00", "2022-09-21 19:10:00",
+    "2022-09-21 19:10:00", "2022-09-21 19:10:00", "2022-09-21 19:10:00",
+    "2022-09-21 19:10:00", "2022-09-21 19:35:00", "2022-09-21 19:35:00",
+    "2022-09-21 19:35:00", "2022-09-21 20:00:00", "2022-09-21 20:00:00",
+    "2022-09-21 20:00:00", "2022-09-21 20:25:00", "2022-09-21 20:25:00",
+    "2022-09-21 20:25:00", "2022-09-21 20:50:00", "2022-09-21 20:50:00"])";
+  const char* times_round = R"([
+    "2022-09-21 18:10:00", "2022-09-21 18:10:00", "2022-09-21 18:10:00",
+    "2022-09-21 18:20:00", "2022-09-21 18:20:00", "2022-09-21 18:10:00",
+    "2022-09-21 18:45:00", "2022-09-21 18:45:00", "2022-09-21 18:45:00",
+    "2022-09-21 19:10:00", "2022-09-21 19:10:00", "2022-09-21 19:10:00",
+    "2022-09-21 19:35:00", "2022-09-21 19:35:00", "2022-09-21 19:35:00",
+    "2022-09-21 19:35:00", "2022-09-21 19:35:00", "2022-09-21 19:35:00",
+    "2022-09-21 20:00:00", "2022-09-21 20:00:00", "2022-09-21 20:00:00",
+    "2022-09-21 20:25:00", "2022-09-21 20:25:00", "2022-09-21 20:25:00",
+    "2022-09-21 20:50:00", "2022-09-21 20:50:00", "2022-09-21 20:50:00"])";
+
+  CheckScalarUnary("ceil_temporal", unit, times, unit, times_ceil, &options);
+  CheckScalarUnary("floor_temporal", unit, times, unit, times_floor, &options);
+  CheckScalarUnary("round_temporal", unit, times, unit, times_round, &options);
+}
+
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalAmbiguous2) {
+  // Europe/Brussels switches from UTC+2:00 to UTC+1:00 on 2018-10-28 03:00:00 UTC+2:00
+  // This causes an hour long ambiguous period in local time.
+  auto unit = timestamp(TimeUnit::NANO, "Europe/Brussels");
+  auto options = RoundTemporalOptions(25, CalendarUnit::MINUTE);
+  const char* complete_times = R"([
+    "2018-10-27 23:05:00", "2018-10-27 23:06:00", "2018-10-27 23:07:00", "2018-10-27 23:08:00",
+    "2018-10-27 23:09:00", "2018-10-27 23:10:00", "2018-10-27 23:11:00", "2018-10-27 23:12:00",
+    "2018-10-27 23:13:00", "2018-10-27 23:14:00", "2018-10-27 23:15:00", "2018-10-27 23:16:00",
+    "2018-10-27 23:17:00", "2018-10-27 23:18:00", "2018-10-27 23:19:00", "2018-10-27 23:20:00",
+    "2018-10-27 23:21:00", "2018-10-27 23:22:00", "2018-10-27 23:23:00", "2018-10-27 23:24:00",
+    "2018-10-27 23:25:00", "2018-10-27 23:26:00", "2018-10-27 23:27:00", "2018-10-27 23:28:00",
+    "2018-10-27 23:29:00", "2018-10-27 23:30:00", "2018-10-27 23:31:00", "2018-10-27 23:32:00",
+    "2018-10-27 23:33:00", "2018-10-27 23:34:00", "2018-10-27 23:35:00", "2018-10-27 23:36:00",
+    "2018-10-27 23:37:00", "2018-10-27 23:38:00", "2018-10-27 23:39:00", "2018-10-27 23:40:00",
+    "2018-10-27 23:41:00", "2018-10-27 23:42:00", "2018-10-27 23:43:00", "2018-10-27 23:44:00",
+    "2018-10-27 23:45:00", "2018-10-27 23:46:00", "2018-10-27 23:47:00", "2018-10-27 23:48:00",
+    "2018-10-27 23:49:00", "2018-10-27 23:50:00", "2018-10-27 23:51:00", "2018-10-27 23:52:00",
+    "2018-10-27 23:53:00", "2018-10-27 23:54:00", "2018-10-27 23:55:00", "2018-10-27 23:56:00",
+    "2018-10-27 23:57:00", "2018-10-27 23:58:00", "2018-10-27 23:59:00", "2018-10-28 00:00:00",
+    "2018-10-28 00:01:00", "2018-10-28 00:02:00", "2018-10-28 00:03:00", "2018-10-28 00:04:00",
+    "2018-10-28 00:05:00", "2018-10-28 00:06:00", "2018-10-28 00:07:00", "2018-10-28 00:08:00",
+    "2018-10-28 00:09:00", "2018-10-28 00:10:00", "2018-10-28 00:11:00", "2018-10-28 00:12:00",
+    "2018-10-28 00:13:00", "2018-10-28 00:14:00", "2018-10-28 00:15:00", "2018-10-28 00:16:00",
+    "2018-10-28 00:17:00", "2018-10-28 00:18:00", "2018-10-28 00:19:00", "2018-10-28 00:20:00",
+    "2018-10-28 00:21:00", "2018-10-28 00:22:00", "2018-10-28 00:23:00", "2018-10-28 00:24:00",
+    "2018-10-28 00:25:00", "2018-10-28 00:26:00", "2018-10-28 00:27:00", "2018-10-28 00:28:00",
+    "2018-10-28 00:29:00", "2018-10-28 00:30:00", "2018-10-28 00:31:00", "2018-10-28 00:32:00",
+    "2018-10-28 00:33:00", "2018-10-28 00:34:00", "2018-10-28 00:35:00", "2018-10-28 00:36:00",
+    "2018-10-28 00:37:00", "2018-10-28 00:38:00", "2018-10-28 00:39:00", "2018-10-28 00:40:00",
+    "2018-10-28 00:41:00", "2018-10-28 00:42:00", "2018-10-28 00:43:00", "2018-10-28 00:44:00",
+    "2018-10-28 00:45:00", "2018-10-28 00:46:00", "2018-10-28 00:47:00", "2018-10-28 00:48:00",
+    "2018-10-28 00:49:00", "2018-10-28 00:50:00", "2018-10-28 00:51:00", "2018-10-28 00:52:00",
+    "2018-10-28 00:53:00", "2018-10-28 00:54:00", "2018-10-28 00:55:00", "2018-10-28 00:56:00",
+    "2018-10-28 00:57:00", "2018-10-28 00:58:00", "2018-10-28 00:59:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:01:00", "2018-10-28 01:02:00", "2018-10-28 01:03:00", "2018-10-28 01:04:00",
+    "2018-10-28 01:05:00", "2018-10-28 01:06:00", "2018-10-28 01:07:00", "2018-10-28 01:08:00",
+    "2018-10-28 01:09:00", "2018-10-28 01:10:00", "2018-10-28 01:11:00", "2018-10-28 01:12:00",
+    "2018-10-28 01:13:00", "2018-10-28 01:14:00", "2018-10-28 01:15:00", "2018-10-28 01:16:00",
+    "2018-10-28 01:17:00", "2018-10-28 01:18:00", "2018-10-28 01:19:00", "2018-10-28 01:20:00",
+    "2018-10-28 01:21:00", "2018-10-28 01:22:00", "2018-10-28 01:23:00", "2018-10-28 01:24:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:26:00", "2018-10-28 01:27:00", "2018-10-28 01:28:00",
+    "2018-10-28 01:29:00", "2018-10-28 01:30:00", "2018-10-28 01:31:00", "2018-10-28 01:32:00",
+    "2018-10-28 01:33:00", "2018-10-28 01:34:00", "2018-10-28 01:35:00", "2018-10-28 01:36:00",
+    "2018-10-28 01:37:00", "2018-10-28 01:38:00", "2018-10-28 01:39:00", "2018-10-28 01:40:00",
+    "2018-10-28 01:41:00", "2018-10-28 01:42:00", "2018-10-28 01:43:00", "2018-10-28 01:44:00",
+    "2018-10-28 01:45:00", "2018-10-28 01:46:00", "2018-10-28 01:47:00", "2018-10-28 01:48:00",
+    "2018-10-28 01:49:00", "2018-10-28 01:50:00", "2018-10-28 01:51:00", "2018-10-28 01:52:00",
+    "2018-10-28 01:53:00", "2018-10-28 01:54:00", "2018-10-28 01:55:00", "2018-10-28 01:56:00",
+    "2018-10-28 01:57:00", "2018-10-28 01:58:00", "2018-10-28 01:59:00", "2018-10-28 02:00:00",
+    "2018-10-28 02:01:00", "2018-10-28 02:02:00", "2018-10-28 02:03:00", "2018-10-28 02:04:00",
+    "2018-10-28 02:05:00", "2018-10-28 02:06:00", "2018-10-28 02:07:00", "2018-10-28 02:08:00",
+    "2018-10-28 02:09:00", "2018-10-28 02:10:00", "2018-10-28 02:11:00", "2018-10-28 02:12:00",
+    "2018-10-28 02:13:00", "2018-10-28 02:14:00", "2018-10-28 02:15:00", "2018-10-28 02:16:00"])";
+
+  const char* complete_times_floor = R"([
+    "2018-10-27 22:45:00", "2018-10-27 22:45:00", "2018-10-27 22:45:00", "2018-10-27 22:45:00",
+    "2018-10-27 22:45:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:45:00",
+    "2018-10-27 23:45:00", "2018-10-27 23:45:00", "2018-10-27 23:45:00", "2018-10-27 23:45:00",
+    "2018-10-27 23:45:00", "2018-10-27 23:45:00", "2018-10-27 23:45:00", "2018-10-27 23:45:00",
+    "2018-10-27 23:45:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00",
+    "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00",
+    "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00",
+    "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00",
+    "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00",
+    "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00",
+    "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00",
+    "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00",
+    "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00",
+    "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00",
+    "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00",
+    "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00",
+    "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 02:15:00", "2018-10-28 02:15:00"])";
+
+  const char* times = R"([
+    "2018-10-27 22:44:00", "2018-10-27 22:45:00", "2018-10-27 22:46:00",
+    "2018-10-27 23:09:00", "2018-10-27 23:10:00", "2018-10-27 23:11:00",
+    "2018-10-27 23:34:00", "2018-10-27 23:35:00", "2018-10-27 23:36:00",
+    "2018-10-27 23:44:00", "2018-10-27 23:45:00", "2018-10-27 23:46:00",
+    "2018-10-27 23:46:00", "2018-10-28 00:00:00", "2018-10-28 00:09:00",
+    "2018-10-28 00:09:00", "2018-10-28 00:10:00", "2018-10-28 00:11:00",
+    "2018-10-28 00:34:00", "2018-10-28 00:35:00", "2018-10-28 00:36:00",
+    "2018-10-28 00:59:00", "2018-10-28 01:00:00", "2018-10-28 01:01:00",
+    "2018-10-28 01:24:00", "2018-10-28 01:25:00", "2018-10-28 01:26:00",
+    "2018-10-28 01:49:00", "2018-10-28 01:50:00", "2018-10-28 01:51:00",
+    "2018-10-28 02:14:00", "2018-10-28 02:15:00", "2018-10-28 02:16:00"])";
+  const char* times_ceil = R"([
+    "2018-10-27 22:45:00", "2018-10-27 22:45:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:45:00",
+    "2018-10-27 23:45:00", "2018-10-27 23:45:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:35:00",
+    "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:50:00",
+    "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 02:15:00",
+    "2018-10-28 02:15:00", "2018-10-28 02:15:00", "2018-10-28 02:40:00"])";
+  const char* times_floor = R"([
+    "2018-10-27 22:20:00", "2018-10-27 22:45:00", "2018-10-27 22:45:00",
+    "2018-10-27 22:45:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:45:00", "2018-10-27 23:45:00",
+    "2018-10-27 23:45:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00",
+    "2018-10-28 00:35:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00",
+    "2018-10-28 01:50:00", "2018-10-28 02:15:00", "2018-10-28 02:15:00"])";
+  const char* times_round = R"([
+    "2018-10-27 22:45:00", "2018-10-27 22:45:00", "2018-10-27 22:45:00",
+    "2018-10-27 23:10:00", "2018-10-27 23:10:00", "2018-10-27 23:10:00",
+    "2018-10-27 23:35:00", "2018-10-27 23:35:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:45:00", "2018-10-27 23:45:00", "2018-10-27 23:35:00",
+    "2018-10-27 23:35:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:10:00", "2018-10-28 00:10:00", "2018-10-28 00:10:00",
+    "2018-10-28 00:35:00", "2018-10-28 00:35:00", "2018-10-28 00:35:00",
+    "2018-10-28 01:00:00", "2018-10-28 01:00:00", "2018-10-28 01:00:00",
+    "2018-10-28 01:25:00", "2018-10-28 01:25:00", "2018-10-28 01:25:00",
+    "2018-10-28 01:50:00", "2018-10-28 01:50:00", "2018-10-28 01:50:00",
+    "2018-10-28 02:15:00", "2018-10-28 02:15:00", "2018-10-28 02:15:00"])";
+
+  CheckScalarUnary("floor_temporal", unit, complete_times, unit, complete_times_floor,
+                   &options);
+  CheckScalarUnary("ceil_temporal", unit, times, unit, times_ceil, &options);
+  CheckScalarUnary("floor_temporal", unit, times, unit, times_floor, &options);
+  CheckScalarUnary("round_temporal", unit, times, unit, times_round, &options);
+}
+
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalNonexistent1) {
+  // Asia/Tehran switches from UTC+3:30 to UTC+4:30 on 2022-03-22 00:00:00 UTC+3:30
+  // This causes an hour long non-existing period in local time.
+  auto unit = timestamp(TimeUnit::SECOND, "Asia/Tehran");
+  auto options = RoundTemporalOptions(16, CalendarUnit::MINUTE);
+  const char* times = R"([
+    "2022-03-21 19:30:00", "2022-03-21 20:00:00", "2022-03-21 20:30:00"])";
+  const char* times_ceil = R"([
+    "2022-03-21 19:42:00", "2022-03-21 20:14:00", "2022-03-21 20:34:00"])";
+  const char* times_floor = R"([
+    "2022-03-21 19:26:00", "2022-03-21 19:58:00", "2022-03-21 20:18:00"])";
+  const char* times_round = R"([
+    "2022-03-21 19:26:00", "2022-03-21 19:58:00", "2022-03-21 20:34:00"])";
+
+  CheckScalarUnary("ceil_temporal", unit, times, unit, times_ceil, &options);
+  CheckScalarUnary("floor_temporal", unit, times, unit, times_floor, &options);
+  CheckScalarUnary("round_temporal", unit, times, unit, times_round, &options);
+}
+
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalNonexistent2) {
+  // Europe/Brussels switches from UTC+1:00 to UTC+2:00 on 2015-03-29 02:00:00 UTC+1:00
+  // This causes an hour long non-existing period in local time.
+  auto unit = timestamp(TimeUnit::SECOND, "Europe/Brussels");
+  auto options = RoundTemporalOptions(16, CalendarUnit::MINUTE);
+  const char* times =
+      R"(["2015-03-29 00:52:00", "2015-03-29 01:01:00", "2015-03-29 01:05:00",
+          "2015-03-29 01:08:00", "2015-03-29 01:10:00", "2015-03-29 01:12:00"])";
+  const char* times_ceil =
+      R"(["2015-03-29 00:52:00", "2015-03-29 01:12:00", "2015-03-29 01:12:00",
+          "2015-03-29 01:12:00", "2015-03-29 01:12:00", "2015-03-29 01:12:00"])";
+  const char* times_floor =
+      R"(["2015-03-29 00:52:00", "2015-03-29 00:56:00", "2015-03-29 00:56:00",

Review Comment:
   Same problem here: `2015-03-29 00:52:00` and `2015-03-29 00:56:00` cannot be _both_ multiples of 16 minutes, since they are separated by 4 minutes in local time.
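
To make the arithmetic behind this comment concrete, here is a minimal sketch (not code from the PR; `MinutesOfDay` and `CanShareGrid` are hypothetical helper names). It assumes both timestamps fall under the same UTC offset, so two results of rounding to a 16-minute grid must be a whole number of 16-minute steps apart:

```cpp
#include <cstdint>

// Minutes since midnight for an "HH:MM" reading (hypothetical helper).
constexpr std::int64_t MinutesOfDay(int hour, int minute) {
  return static_cast<std::int64_t>(hour) * 60 + minute;
}

// True if two instants under the same UTC offset could both lie on a grid of
// `multiple` minutes: their distance must be a whole number of grid steps.
constexpr bool CanShareGrid(std::int64_t a, std::int64_t b, std::int64_t multiple) {
  return (a - b) % multiple == 0;
}

// 00:52 and 00:56 are 4 minutes apart, so at most one of them is on the grid.
static_assert(!CanShareGrid(MinutesOfDay(0, 52), MinutesOfDay(0, 56), 16),
              "4 minutes apart cannot share a 16-minute grid");
// 00:52 and 01:08 are exactly one step apart.
static_assert(CanShareGrid(MinutesOfDay(0, 52), MinutesOfDay(1, 8), 16),
              "16 minutes apart can share a 16-minute grid");
```

The checks run at compile time, so the snippet needs no time zone database.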



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] pitrou commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
pitrou commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1149820331

   > Another example would be creating a histogram of taxi pick-up times in local time.
   
   If you want your histogram to be useful, the buckets have to have equal durations in physical time. Looking at your graph, that would not be the case for the "preserve wall" variants?
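
The concern about bucket durations can be illustrated with a toy model (a sketch under stated assumptions, not the PR's "preserve wall" implementation). It hard-codes the Europe/Brussels fall-back of 2018-10-28 instead of consulting a time zone database, and the names `WallMinutes`, `WallHourBucket` and `PhysicalMinutesInBucket` are hypothetical. Bucketing by wall-clock hour makes the 02:00 bucket cover two physical hours:

```cpp
#include <cstdint>

// Assumption: t counts physical minutes since local midnight of 2018-10-28 in
// Europe/Brussels (22:00 UTC the day before). At t = 180 the wall clock jumps
// from 03:00 CEST back to 02:00 CET.
constexpr std::int64_t kFallBackAt = 180;

// Wall-clock reading, in minutes since local midnight, at physical minute t.
constexpr std::int64_t WallMinutes(std::int64_t t) {
  return t < kFallBackAt ? t : t - 60;
}

// Start of the wall-clock hour bucket that physical minute t falls into.
constexpr std::int64_t WallHourBucket(std::int64_t t) {
  return WallMinutes(t) / 60 * 60;
}

// Number of physical minutes, within the first six hours, mapped to `bucket`.
constexpr std::int64_t PhysicalMinutesInBucket(std::int64_t bucket) {
  std::int64_t total = 0;
  for (std::int64_t t = 0; t < 6 * 60; ++t) {
    if (WallHourBucket(t) == bucket) ++total;
  }
  return total;
}

static_assert(PhysicalMinutesInBucket(60) == 60, "01:00 bucket lasts one hour");
static_assert(PhysicalMinutesInBucket(120) == 120, "02:00 bucket lasts two hours");
static_assert(PhysicalMinutesInBucket(180) == 60, "03:00 bucket lasts one hour");
```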




[GitHub] [arrow] rok commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1149855616

   > > Another example would be creating a histogram of taxi pick-up times in local time.
   > 
   > If you want your histogram to be useful, the buckets have to have equal durations in physical time. Looking at your graph, that would not be the case for the "preserve wall" variants?
   
   Physically equal buckets would be great, but we can't have them given the constraints. I chose to flatten the first fold for floor and the second for ceil because I felt the functions imply that. We could also use double-size buckets in wall time to distribute the flattening over the ambiguous period. Either way we don't have a "good excuse" for changing buckets.
   Also, the user would have to choose "preserve wall" explicitly, so they would presumably know what they're doing and why.
   
   Again, we can move "preserve wall" out of this PR's scope, open a Jira issue linking to this discussion, and see if there's a need to implement it later on.




[GitHub] [arrow] pitrou commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
pitrou commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1117301846

   Hmm, apparently, some comments I did long ago on an outdated version got posted with this PR! Sorry, feel free to disregard them :-)




[GitHub] [arrow] pitrou commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
pitrou commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1117326973

   > > Hmm, apparently, some comments I did long ago on an outdated version got posted with this PR! Sorry, feel free to disregard them :-)
   > 
   > Is it possible to remove the unwanted ones?
   
   Ok, done.




[GitHub] [arrow] rok commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1117325814

   > Hmm, apparently, some comments I did long ago on an outdated version got posted with this PR! Sorry, feel free to disregard them :-)
   
   Is it possible to remove the unwanted ones?




[GitHub] [arrow] pitrou commented on a diff in pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
pitrou commented on code in PR #12528:
URL: https://github.com/apache/arrow/pull/12528#discussion_r865039810


##########
cpp/src/arrow/compute/kernels/scalar_temporal_test.cc:
##########
@@ -2611,6 +2611,107 @@ TEST_F(ScalarTemporalTest, TestRoundTemporal) {
   CheckScalarUnary(op, unit, times, unit, round_15_years, &round_to_15_years);
 }
 
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalAmbiguous1) {
+  auto unit = timestamp(TimeUnit::MILLI, "Asia/Tehran");
+  auto options = RoundTemporalOptions(1, CalendarUnit::HOUR, /*some_parameter=*/true);
+  // Asia/Tehran switched from UTC+X to UTC+Y on 2022-03-31 HH:mm:ss

Review Comment:
   The intent was that you fill the actual parameter name, and timezone/timestamps values :-)





[GitHub] [arrow] rok commented on a diff in pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on code in PR #12528:
URL: https://github.com/apache/arrow/pull/12528#discussion_r865128182


##########
cpp/src/arrow/compute/kernels/temporal_internal.h:
##########
@@ -105,6 +105,39 @@ struct NonZonedLocalizer {
     return t;
   }
 
+  template <typename Duration, typename Unit>
+  Duration FloorTimePoint(int64_t t, const RoundTemporalOptions* options) const {

Review Comment:
   Done.





[GitHub] [arrow] rok commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1157503008

   @raulcd 




[GitHub] [arrow] rok commented on a diff in pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on code in PR #12528:
URL: https://github.com/apache/arrow/pull/12528#discussion_r867157111


##########
cpp/src/arrow/compute/kernels/scalar_temporal_test.cc:
##########
@@ -2611,6 +2611,114 @@ TEST_F(ScalarTemporalTest, TestRoundTemporal) {
   CheckScalarUnary(op, unit, times, unit, round_15_years, &round_to_15_years);
 }
 
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalAmbiguous1) {
+  // Asia/Tehran switches from UTC+4:30 to UTC+3:30 on 2022-09-22 00:00:00 UTC+4:30.
+  // This causes an hour long ambiguous period in local time.
+  auto unit = timestamp(TimeUnit::MILLI, "Asia/Tehran");
+  auto options = RoundTemporalOptions(1, CalendarUnit::HOUR, /*some_parameter=*/true);
+  const char* times = R"([
+    "2022-03-21 19:30:00", "2022-03-21 20:00:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 19:00:00", "2022-09-21 19:30:00",
+    "2022-09-21 20:00:00", "2022-09-21 20:30:00", "2022-09-21 21:00:00",
+    "2022-09-21 21:30:00"])";
+  const char* times_ceil = R"([
+    "2022-03-21 19:30:00", "2022-03-21 20:30:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 19:30:00", "2022-09-21 19:30:00",
+    "2022-09-21 20:30:00", "2022-09-21 20:30:00", "2022-09-21 21:30:00",
+    "2022-09-21 21:30:00"])";
+  const char* times_floor = R"([
+    "2022-03-21 19:30:00", "2022-03-21 19:30:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 18:30:00", "2022-09-21 19:30:00",
+    "2022-09-21 19:30:00", "2022-09-21 20:30:00", "2022-09-21 20:30:00",
+    "2022-09-21 21:30:00"])";
+  const char* times_round = R"([
+    "2022-03-21 19:30:00", "2022-03-21 20:30:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 19:30:00", "2022-09-21 19:30:00",

Review Comment:
   Done. It actually surfaced a potential problem I need to fix.





[GitHub] [arrow] rok commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1149746730

   > @rok I'm not sure how to take a decision based on these graphs. We care about monotonicity in UTC timestamps, right? The graphs don't seem to show a particular problem in that case.
   
   The graphs are meant to supplement testing, as they let us cover a bigger parameter space and see if the results make sense.
   
   The `preserve_wall` option preserves monotonicity in wall time as well by flattening part of the ambiguous periods. I suppose we only need monotonicity in UTC, so `preserve_wall` can be removed and introduced later if asked for. What would you be in favour of?
   
   > TBH, I'm starting to question the entire idea of rounding in local time. What are the use cases?
   
   Use case would be something like: when do users from a certain country come to my website? Rounding in UTC would be inconsistent. I believe it is a needed feature.
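
A small sketch of why rounding in UTC and rounding in local time give different results for a zone whose offset is not a whole number of hours. It assumes a constant UTC+3:30 offset (Asia/Tehran standard time, ignoring DST transitions), and `FloorTo` and `FloorInLocalTime` are hypothetical helpers, not Arrow functions:

```cpp
#include <cstdint>

// Floor t to a multiple of `unit`, rounding toward negative infinity.
constexpr std::int64_t FloorTo(std::int64_t t, std::int64_t unit) {
  return (t >= 0 ? t / unit : (t - unit + 1) / unit) * unit;
}

// Floor a UTC instant on the local wall-clock grid: shift to local time,
// floor there, then shift back. Only valid while the offset is constant.
constexpr std::int64_t FloorInLocalTime(std::int64_t utc_minutes, std::int64_t unit,
                                        std::int64_t offset_minutes) {
  return FloorTo(utc_minutes + offset_minutes, unit) - offset_minutes;
}

constexpr std::int64_t kTehranOffset = 3 * 60 + 30;  // assumed fixed UTC+3:30
constexpr std::int64_t kVisit = 18 * 60 + 10;        // 18:10 UTC = 21:40 local

// Flooring in UTC gives 18:00 UTC, which is 21:30 on the local clock.
static_assert(FloorTo(kVisit, 60) == 18 * 60, "UTC hour grid");
// Flooring in local time gives 21:00 local, which is 17:30 UTC.
static_assert(FloorInLocalTime(kVisit, 60, kTehranOffset) == 17 * 60 + 30,
              "local hour grid");
// Negative inputs still round down.
static_assert(FloorTo(-1, 60) == -60, "floor toward negative infinity");
```

With hourly buckets computed in UTC, every local bucket for this zone would start at half past the hour, which is the inconsistency described above.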




[GitHub] [arrow] pitrou commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
pitrou commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1149732361

   @rok I'm not sure how to take a decision based on these graphs. We care about monotonicity in UTC timestamps, right? The graphs don't seem to show a particular problem in that case.
   
   TBH, I'm starting to question the entire idea of rounding in local time. What are the use cases?




[GitHub] [arrow] rok commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1149784835

   > I think monotonicity in UTC is important for analytics use cases (such as bucketing). Monotonicity in wall time not so much.
   
   Let's say you want to analyse trades in arbitrary intervals. In UTC things would make sense but when you view them in wall time buckets with broken order you could have negative inventory because a sell could come before a buy. But I'm not sure anyone wants this right now and I'm ok dropping it. 
   
   > > Use case would be something like: when do users from a certain country come to my website? Rounding in UTC would be inconsistent. I believe it is a needed feature.
   > 
   > I'm not sure I understand precisely what you mean, but is that part of an analytics workload?
   
   Another example would be creating a histogram of taxi pick-up times in local time. I think we had [local rounding discussion here](https://github.com/apache/arrow/pull/11818#issuecomment-999499229). 
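
The wall-time ordering issue mentioned above for trades can be sketched with a toy model. This is an illustration under stated assumptions, not code from the PR: the Europe/Brussels fall-back of 2018-10-28 is hard-coded, and `WallClock`, `WallBucket`, `kBuy` and `kSell` are hypothetical names. A sell that happens after a buy in physical time can land in an earlier wall-time bucket:

```cpp
#include <cstdint>

// Assumption: t counts physical minutes since local midnight of 2018-10-28 in
// Europe/Brussels. At t = 180 the wall clock jumps from 03:00 CEST to 02:00 CET.
constexpr std::int64_t kFallBackAt = 180;

// Wall-clock reading, in minutes since local midnight, at physical minute t.
constexpr std::int64_t WallClock(std::int64_t t) {
  return t < kFallBackAt ? t : t - 60;
}

// Start of the wall-clock bucket of `multiple` minutes containing t.
constexpr std::int64_t WallBucket(std::int64_t t, std::int64_t multiple) {
  return WallClock(t) / multiple * multiple;
}

constexpr std::int64_t kBuy = 170;   // 02:50 CEST, first pass through 02:xx
constexpr std::int64_t kSell = 190;  // 02:10 CET, 20 physical minutes later

static_assert(kBuy < kSell, "the buy comes first in physical time");
static_assert(WallBucket(kBuy, 30) == 150, "buy lands in the 02:30 bucket");
static_assert(WallBucket(kSell, 30) == 120, "sell lands in the 02:00 bucket");
static_assert(WallBucket(kSell, 30) < WallBucket(kBuy, 30),
              "in wall-time buckets the sell appears before the buy");
```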
   
   




[GitHub] [arrow] jorisvandenbossche commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1131369183

   Lubridate:
   
   ```
   > x <- ymd_hms(c("2015-03-29 01:50:00", "2015-03-29 01:59:59", "2015-03-29 03:00:00", "2015-03-29 03:10:00"), tz = "Europe/Brussels")
   > x
   [1] "2015-03-29 01:50:00 CET"  "2015-03-29 01:59:59 CET" 
   [3] "2015-03-29 03:00:00 CEST" "2015-03-29 03:10:00 CEST"
   > round_date(x, "16 mins")
   [1] "2015-03-29 01:48:00 CET"  "2015-03-29 03:04:00 CEST"
   [3] "2015-03-29 03:00:00 CEST" "2015-03-29 03:16:00 CEST"
   ```
   
   (I haven't yet tried to understand the logic behind those results, but none of the 4 values match any of the values that this PR currently returns ...)
   
   In pandas:
   
   ```
   x = pd.Series(pd.to_datetime(["2015-03-29 01:50:00", "2015-03-29 01:59:59", "2015-03-29 03:00:00", "2015-03-29 03:10:00"]).tz_localize("Europe/Brussels"))
   
   In [9]: x
   Out[9]: 
   0   2015-03-29 01:50:00+01:00
   1   2015-03-29 01:59:59+01:00
   2   2015-03-29 03:00:00+02:00
   3   2015-03-29 03:10:00+02:00
   dtype: datetime64[ns, Europe/Brussels]
   
   In [12]: x.dt.round("16min", nonexistent="NaT")
   Out[12]: 
   0   2015-03-29 01:52:00+01:00
   1   2015-03-29 01:52:00+01:00
   2                         NaT
   3   2015-03-29 03:12:00+02:00
   dtype: datetime64[ns, Europe/Brussels]
   ```
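
The pandas output above appears consistent with rounding the wall-clock reading to the nearest multiple of 16 minutes and then checking whether the result falls into the DST gap. Below is a minimal sketch of that reading, not pandas' actual implementation: the gap from 02:00 to 03:00 on 2015-03-29 is hard-coded, ties are rounded up (none of the four inputs is a tie), and all names are hypothetical.

```cpp
#include <cstdint>

constexpr std::int64_t kUnit = 16 * 60;        // 16 minutes, in seconds
constexpr std::int64_t kGapBegin = 2 * 3600;   // 02:00:00 local, assumed gap start
constexpr std::int64_t kGapEnd = 3 * 3600;     // 03:00:00 local, assumed gap end

// A day is a whole number of 16-minute steps, so seconds since local midnight
// lie on the same grid as seconds since the epoch.
static_assert(86400 % kUnit == 0, "day must be a multiple of the unit");

// Seconds since local midnight for an HH:MM:SS wall-clock reading.
constexpr std::int64_t Hms(std::int64_t h, std::int64_t m, std::int64_t s) {
  return h * 3600 + m * 60 + s;
}

// Round a non-negative wall-clock reading to the nearest multiple of `unit`.
constexpr std::int64_t RoundWall(std::int64_t wall_seconds, std::int64_t unit) {
  return (wall_seconds + unit / 2) / unit * unit;
}

// True if the wall-clock reading was skipped by the spring-forward jump.
constexpr bool IsNonexistent(std::int64_t wall_seconds) {
  return wall_seconds >= kGapBegin && wall_seconds < kGapEnd;
}

static_assert(RoundWall(Hms(1, 50, 0), kUnit) == Hms(1, 52, 0), "01:50:00 -> 01:52");
static_assert(RoundWall(Hms(1, 59, 59), kUnit) == Hms(1, 52, 0), "01:59:59 -> 01:52");
static_assert(RoundWall(Hms(3, 0, 0), kUnit) == Hms(2, 56, 0), "03:00:00 -> 02:56");
static_assert(IsNonexistent(RoundWall(Hms(3, 0, 0), kUnit)), "02:56 is in the gap (NaT)");
static_assert(RoundWall(Hms(3, 10, 0), kUnit) == Hms(3, 12, 0), "03:10:00 -> 03:12");
```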




[GitHub] [arrow] jorisvandenbossche commented on a diff in pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on code in PR #12528:
URL: https://github.com/apache/arrow/pull/12528#discussion_r875957401


##########
cpp/src/arrow/compute/kernels/scalar_temporal_test.cc:
##########
@@ -2854,17 +2854,66 @@ TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalNonexistent2) {
       R"(["2015-03-29 00:52:00", "2015-03-29 01:12:00", "2015-03-29 01:12:00",
           "2015-03-29 01:12:00", "2015-03-29 01:12:00", "2015-03-29 01:12:00"])";
   const char* times_floor =
-      R"(["2015-03-29 00:52:00", "2015-03-29 00:52:00", "2015-03-29 00:52:00",
-          "2015-03-29 00:52:00", "2015-03-29 00:52:00", "2015-03-29 01:12:00"])";
+      R"(["2015-03-29 00:52:00", "2015-03-29 00:56:00", "2015-03-29 00:56:00",
+          "2015-03-29 00:56:00", "2015-03-29 00:56:00", "2015-03-29 01:12:00"])";
   const char* times_round =
-      R"(["2015-03-29 00:52:00", "2015-03-29 01:08:00", "2015-03-29 01:12:00",
+      R"(["2015-03-29 00:52:00", "2015-03-29 00:56:00", "2015-03-29 01:12:00",
           "2015-03-29 01:12:00", "2015-03-29 01:12:00", "2015-03-29 01:12:00"])";
 
   CheckScalarUnary("ceil_temporal", unit, times, unit, times_ceil, &options);
   CheckScalarUnary("floor_temporal", unit, times, unit, times_floor, &options);
   CheckScalarUnary("round_temporal", unit, times, unit, times_round, &options);
 }
 
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalDSTJump) {
+  // Europe/Brussels switches from UTC+2:00 to UTC+1:00 on 2018-10-28 03:00:00 UTC+2:00
+  // This causes an hour long ambiguous period in local time.
+  // Europe/Brussels switches from UTC+1:00 to UTC+2:00 on 2015-03-29 02:00:00 UTC+1:00
+  // This causes an hour long non-existing period in local time.
+  auto unit = timestamp(TimeUnit::SECOND, "Europe/Brussels");
+  auto options = RoundTemporalOptions(256, CalendarUnit::MINUTE);
+  const char* times =
+      R"(["2015-03-28 21:31:00", "2015-03-28 23:32:00", "2015-03-28 23:33:00",
+          "2015-03-28 23:53:00", "2015-03-29 01:08:00", "2015-03-29 01:28:00",
+          "2015-03-29 01:32:00", "2015-03-29 01:51:00", "2015-03-29 02:12:00",
+          "2015-03-29 02:44:00", "2015-03-29 02:59:00", "2015-03-29 03:02:00",
+          "2015-03-29 03:08:00", "2015-03-29 03:26:00", "2015-03-29 04:59:00",
+          "2018-10-27 20:44:00", "2018-10-27 21:45:00", "2018-10-27 22:46:00",
+          "2018-10-27 23:09:00", "2018-10-27 23:10:00", "2018-10-27 23:11:00",
+          "2018-10-28 03:14:00", "2018-10-28 04:15:00", "2018-10-28 05:16:00"])";

Review Comment:
   Is it actually necessary to have that many dates to cover the different cases you can run into here? It does make the test harder to understand.





[GitHub] [arrow] rok commented on a diff in pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on code in PR #12528:
URL: https://github.com/apache/arrow/pull/12528#discussion_r851496621


##########
cpp/src/arrow/compute/kernels/temporal_internal.h:
##########
@@ -120,18 +123,59 @@ struct ZonedLocalizer {
   }
 
   template <typename Duration>
-  Duration ConvertLocalToSys(Duration t, Status* st) const {
+  Duration get_local_time(Duration arg) const {
+    return zoned_time<Duration>(tz, local_time<Duration>(arg))
+        .get_sys_time()
+        .time_since_epoch();
+  }
+
+  template <typename Duration>
+  Duration get_local_time(Duration arg, const arrow_vendored::date::choose choose) const {
+    return zoned_time<Duration>(tz, local_time<Duration>(arg), choose)
+        .get_sys_time()
+        .time_since_epoch();
+  }
+
+  template <typename Duration>
+  Duration ConvertLocalToSys(
+      Duration t, Status* st,
+      const AmbiguousTime ambiguous = AmbiguousTime::AMBIGUOUS_RAISE,
+      const NonexistentTime nonexistent_time = NonexistentTime::NONEXISTENT_RAISE) const {
     try {
       return zoned_time<Duration>{tz, local_time<Duration>(t)}
           .get_sys_time()
           .time_since_epoch();
     } catch (const arrow_vendored::date::nonexistent_local_time& e) {
-      *st = Status::Invalid("Local time does not exist: ", e.what());
-      return Duration{0};
+      switch (nonexistent_time) {
+        case NonexistentTime::NONEXISTENT_RAISE: {
+          *st = Status::Invalid("Timestamp doesn't exist in timezone '", tz,
+                                "': ", e.what());
+          return t;
+        }
+        case NonexistentTime::NONEXISTENT_EARLIEST: {
+          return get_local_time<Duration>(t, arrow_vendored::date::choose::latest) -
+                 Duration{1};
+        }
+        case NonexistentTime::NONEXISTENT_LATEST: {
+          return get_local_time<Duration>(t, arrow_vendored::date::choose::latest);
+        }
+      }
     } catch (const arrow_vendored::date::ambiguous_local_time& e) {
-      *st = Status::Invalid("Local time is ambiguous: ", e.what());
-      return Duration{0};
+      switch (ambiguous) {
+        case AmbiguousTime::AMBIGUOUS_RAISE: {
+          *st = Status::Invalid("Timestamp is ambiguous in timezone '", tz,
+                                "': ", e.what());
+          return t;
+        }
+        case AmbiguousTime::AMBIGUOUS_EARLIEST: {
+          return get_local_time<Duration>(t, arrow_vendored::date::choose::earliest);
+        }
+        case AmbiguousTime::AMBIGUOUS_LATEST: {
+          return get_local_time<Duration>(t, arrow_vendored::date::choose::latest);
+        }
+      }
     }
+    return Duration{0};
   }
 

Review Comment:
   ```suggestion
     template <typename Duration, typename Unit>
     Duration FloorTime(const int64_t t, const RoundTemporalOptions* options) const {
       const sys_time<Duration> st = sys_time<Duration>{Duration{t}};
       const std::chrono::seconds offset = tz->get_info(tz->to_local(st)).first.offset;
       const Unit d = floor<Unit>(st - offset).time_since_epoch();
   
       if (options->multiple == 1) {
         return duration_cast<Duration>(d + offset);
       } else {
         const Unit unit = Unit{options->multiple};
         const Unit m = (d.count() >= 0) ? d / unit * unit : (d - unit + Unit{1}) / unit * unit;
         return duration_cast<Duration>(m + offset);
       }
     }
   
     template <typename Duration, typename Unit>
     Duration CeilTime(const int64_t t, const RoundTemporalOptions* options) const {
       const Duration d = FloorTime<Duration, Unit>(t, options);
       if (d.count() < t) {
         return d + duration_cast<Duration>(Unit{options->multiple});
       }
       return d;
     }
   
     template <typename Duration, typename Unit>
     Duration RoundTime(const int64_t t, const RoundTemporalOptions* options) const {
       const Duration f = FloorTime<Duration, Unit>(t, options);
       Duration c = f;
       if (f.count() < t) {
         c += duration_cast<Duration>(Unit{options->multiple});
       }
       return (t - f.count() >= c.count() - t) ? c : f;
     }
   ```



##########
cpp/src/arrow/compute/kernels/temporal_internal.h:
##########
@@ -101,7 +101,10 @@ struct NonZonedLocalizer {
   }
 
   template <typename Duration>
-  Duration ConvertLocalToSys(Duration t, Status* st) const {
+  Duration ConvertLocalToSys(
+      Duration t, Status* st,
+      const AmbiguousTime ambiguous = AmbiguousTime::AMBIGUOUS_RAISE,
+      const NonexistentTime nonexistent_time = NonexistentTime::NONEXISTENT_RAISE) const {
     return t;
   }
 

Review Comment:
   ```suggestion
     template <typename Duration, typename Unit>
     Duration FloorTime(int64_t t, const RoundTemporalOptions* options) const {
       const Unit d = floor<Unit>(sys_time<Duration>(Duration{t})).time_since_epoch();
   
       if (options->multiple == 1) {
         return duration_cast<Duration>(d);
       } else {
         const Unit unit = Unit{options->multiple};
         const Unit m = (d.count() >= 0) ? d / unit * unit : (d - unit + Unit{1}) / unit * unit;
         return duration_cast<Duration>(m);
       }
     }
   
     template <typename Duration, typename Unit>
     Duration CeilTime(int64_t t, const RoundTemporalOptions* options) const {
       const Duration d = FloorTime<Duration, Unit>(t, options);
       if (d.count() < t) {
         return d + duration_cast<Duration>(Unit{options->multiple});
       }
       return d;
     }
   
     template <typename Duration, typename Unit>
     Duration RoundTime(const int64_t t, const RoundTemporalOptions* options) const {
       const Duration f = FloorTime<Duration, Unit>(t, options);
       Duration c = f;
       if (f.count() < t) {
         c += duration_cast<Duration>(Unit{options->multiple});
       }
       return (t - f.count() >= c.count() - t) ? c : f;
     }
   ```





[GitHub] [arrow] rok commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1112848819

   @ursabot please benchmark command=cpp-micro --suite-filter=scalar-temporal




[GitHub] [arrow] rok commented on a diff in pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on code in PR #12528:
URL: https://github.com/apache/arrow/pull/12528#discussion_r865137124


##########
cpp/src/arrow/compute/kernels/temporal_internal.h:
##########
@@ -119,19 +152,93 @@ struct ZonedLocalizer {
     return tz->to_local(sys_time<Duration>(Duration{t}));
   }
 
-  template <typename Duration>
-  Duration ConvertLocalToSys(Duration t, Status* st) const {
+  template <typename Duration, typename Unit>
+  Duration FloorTimePoint(const int64_t t, const RoundTemporalOptions* options) const {
+    local_time<Duration> lt = tz->to_local(sys_time<Duration>(Duration{t}));
+    const Unit d = floor<Unit>(lt).time_since_epoch();
+    Unit d2;
+
+    if (options->multiple == 1) {
+      d2 = d;
+    } else {
+      const Unit unit = Unit{options->multiple};
+      d2 = (d.count() >= 0) ? d / unit * unit : (d - unit + Unit{1}) / unit * unit;
+    }
+
     try {
-      return zoned_time<Duration>{tz, local_time<Duration>(t)}
+      return zoned_time<Duration>(tz, local_time<Duration>(duration_cast<Duration>(d2)))
           .get_sys_time()
           .time_since_epoch();
-    } catch (const arrow_vendored::date::nonexistent_local_time& e) {
-      *st = Status::Invalid("Local time does not exist: ", e.what());
-      return Duration{0};
-    } catch (const arrow_vendored::date::ambiguous_local_time& e) {
-      *st = Status::Invalid("Local time is ambiguous: ", e.what());
-      return Duration{0};
+    } catch (const arrow_vendored::date::ambiguous_local_time&) {
+      // In case we hit an ambiguous period we round to a time multiple just prior,
+      // convert to UTC and add the time unit we're rounding to.
+      const arrow_vendored::date::local_info li =
+          tz->get_info(local_time<Duration>(duration_cast<Duration>(d2)));
+
+      const Duration t2 =
+          zoned_time<Duration>(
+              tz, local_time<Duration>(duration_cast<Duration>(d2 - li.second.offset)))
+              .get_sys_time()
+              .time_since_epoch();
+      const Unit unit = Unit{options->multiple};
+      const Unit t3 =
+          (t2.count() >= 0) ? t2 / unit * unit : (t2 - unit + Unit{1}) / unit * unit;
+      const Duration t4 = duration_cast<Duration>(t3 + li.first.offset);
+      if (t < t4.count()) {
+        return duration_cast<Duration>(t3 + li.second.offset);
+      }
+      return duration_cast<Duration>(t4);
+    } catch (const arrow_vendored::date::nonexistent_local_time&) {
+      // In case we hit a nonexistent period we calculate the duration between the
+      // start of nonexistent period and rounded to moment in UTC (nonexistent_offset).
+      // We then floor the beginning of the nonexisting period in local time and add
+      // nonexistent_offset to that time point in UTC.
+      const arrow_vendored::date::local_info li =
+          tz->get_info(local_time<Duration>(duration_cast<Duration>(d2)));
+
+      const Duration t2 =
+          zoned_time<Duration>(tz, local_time<Duration>(duration_cast<Duration>(d2)),
+                               arrow_vendored::date::choose::earliest)
+              .get_sys_time()
+              .time_since_epoch();
+      const Duration nonexistent_offset = duration_cast<Duration>(t2 - d2);
+
+      const Unit unit = Unit{options->multiple};
+      const Unit t3 =
+          (t2.count() >= 0) ? t2 / unit * unit : (t2 - unit + Unit{1}) / unit * unit;
+      return duration_cast<Duration>(t3 + li.second.offset) + nonexistent_offset;
+    }
+  }
+
+  template <typename Duration, typename Unit>
+  Duration CeilTimePoint(const int64_t t, const RoundTemporalOptions* options) const {
+    const Duration d = FloorTimePoint<Duration, Unit>(t, options);
+    if (d.count() == t) {
+      return d;
+    }
+    return FloorTimePoint<Duration, Unit>(
+        t + duration_cast<Duration>(Unit{options->multiple}).count(), options);

Review Comment:
   There definitely is a more efficient way; however, this PR doesn't change the current situation, and I think the optimisation fits more logically into the scope of https://github.com/apache/arrow/pull/13043.





[GitHub] [arrow] pitrou commented on a diff in pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
pitrou commented on code in PR #12528:
URL: https://github.com/apache/arrow/pull/12528#discussion_r820888807


##########
cpp/src/arrow/compute/api_scalar.h:
##########
@@ -104,10 +104,28 @@ enum class CalendarUnit : int8_t {
   YEAR
 };
 
+/// \brief How to interpret ambiguous local times that can be interpreted as
+/// multiple instants (normally two) due to DST shifts.
+///
+/// AMBIGUOUS_EARLIEST emits the earliest instant amongst possible interpretations.
+/// AMBIGUOUS_LATEST emits the latest instant amongst possible interpretations.
+enum AmbiguousTime { AMBIGUOUS_RAISE, AMBIGUOUS_EARLIEST, AMBIGUOUS_LATEST };
+
+/// \brief How to handle local times that do not exist due to DST shifts.
+///
+/// NONEXISTENT_EARLIEST emits the instant "just before" the DST shift instant
+/// in the given timestamp precision (for example, for a nanoseconds precision
+/// timestamp, this is one nanosecond before the DST shift instant).
+/// NONEXISTENT_LATEST emits the DST shift instant.
+enum NonexistentTime { NONEXISTENT_RAISE, NONEXISTENT_EARLIEST, NONEXISTENT_LATEST };

Review Comment:
   Same here.



##########
cpp/src/arrow/compute/kernels/temporal_internal.h:
##########
@@ -105,6 +105,39 @@ struct NonZonedLocalizer {
     return t;
   }
 
+  template <typename Duration, typename Unit>
+  Duration FloorTimePoint(int64_t t, const RoundTemporalOptions* options) const {

Review Comment:
   Nit, but why not pass the options by const-reference? I don't think it's allowed to omit them.
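
   To make the nit concrete, here is a minimal sketch. It uses a hypothetical `Options` struct as a stand-in for `RoundTemporalOptions`, and the helper names are assumptions made for illustration, not Arrow's API. A pointer parameter lets callers pass `nullptr`, so the callee has to decide what a missing object means; a const reference states in the signature that the options are required.

   ```cpp
   #include <cstdint>

   // Hypothetical stand-in for RoundTemporalOptions; only `multiple` is modelled
   // and it is assumed to be positive.
   struct Options {
     int64_t multiple = 1;
   };

   // Pointer parameter: nullptr is a legal argument, so the callee needs a
   // fallback (here: treat a missing object as multiple == 1).
   inline int64_t FloorByPointer(int64_t t, const Options* options) {
     const int64_t m = (options != nullptr) ? options->multiple : 1;
     return t - ((t % m) + m) % m;  // floor toward negative infinity
   }

   // Const-reference parameter: the options cannot be omitted, no null check.
   inline int64_t FloorByReference(int64_t t, const Options& options) {
     const int64_t m = options.multiple;
     return t - ((t % m) + m) % m;  // floor toward negative infinity
   }
   ```

   Both helpers compute the same floor; only the contract expressed by the signature differs.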



##########
cpp/src/arrow/compute/kernels/temporal_internal.h:
##########
@@ -119,19 +152,93 @@ struct ZonedLocalizer {
     return tz->to_local(sys_time<Duration>(Duration{t}));
   }
 
-  template <typename Duration>
-  Duration ConvertLocalToSys(Duration t, Status* st) const {
+  template <typename Duration, typename Unit>
+  Duration FloorTimePoint(const int64_t t, const RoundTemporalOptions* options) const {
+    local_time<Duration> lt = tz->to_local(sys_time<Duration>(Duration{t}));
+    const Unit d = floor<Unit>(lt).time_since_epoch();
+    Unit d2;
+
+    if (options->multiple == 1) {
+      d2 = d;
+    } else {
+      const Unit unit = Unit{options->multiple};
+      d2 = (d.count() >= 0) ? d / unit * unit : (d - unit + Unit{1}) / unit * unit;
+    }
+
     try {
-      return zoned_time<Duration>{tz, local_time<Duration>(t)}
+      return zoned_time<Duration>(tz, local_time<Duration>(duration_cast<Duration>(d2)))
           .get_sys_time()
           .time_since_epoch();
-    } catch (const arrow_vendored::date::nonexistent_local_time& e) {
-      *st = Status::Invalid("Local time does not exist: ", e.what());
-      return Duration{0};
-    } catch (const arrow_vendored::date::ambiguous_local_time& e) {
-      *st = Status::Invalid("Local time is ambiguous: ", e.what());
-      return Duration{0};
+    } catch (const arrow_vendored::date::ambiguous_local_time&) {
+      // In case we hit an ambiguous period we round to a time multiple just prior,
+      // convert to UTC and add the time unit we're rounding to.
+      const arrow_vendored::date::local_info li =
+          tz->get_info(local_time<Duration>(duration_cast<Duration>(d2)));
+
+      const Duration t2 =
+          zoned_time<Duration>(
+              tz, local_time<Duration>(duration_cast<Duration>(d2 - li.second.offset)))
+              .get_sys_time()
+              .time_since_epoch();
+      const Unit unit = Unit{options->multiple};
+      const Unit t3 =
+          (t2.count() >= 0) ? t2 / unit * unit : (t2 - unit + Unit{1}) / unit * unit;
+      const Duration t4 = duration_cast<Duration>(t3 + li.first.offset);
+      if (t < t4.count()) {
+        return duration_cast<Duration>(t3 + li.second.offset);
+      }
+      return duration_cast<Duration>(t4);
+    } catch (const arrow_vendored::date::nonexistent_local_time&) {
+      // In case we hit a nonexistent period we calculate the duration between the
+      // start of nonexistent period and rounded to moment in UTC (nonexistent_offset).
+      // We then floor the beginning of the nonexisting period in local time and add
+      // nonexistent_offset to that time point in UTC.
+      const arrow_vendored::date::local_info li =
+          tz->get_info(local_time<Duration>(duration_cast<Duration>(d2)));
+
+      const Duration t2 =
+          zoned_time<Duration>(tz, local_time<Duration>(duration_cast<Duration>(d2)),
+                               arrow_vendored::date::choose::earliest)
+              .get_sys_time()
+              .time_since_epoch();
+      const Duration nonexistent_offset = duration_cast<Duration>(t2 - d2);
+
+      const Unit unit = Unit{options->multiple};
+      const Unit t3 =
+          (t2.count() >= 0) ? t2 / unit * unit : (t2 - unit + Unit{1}) / unit * unit;
+      return duration_cast<Duration>(t3 + li.second.offset) + nonexistent_offset;
+    }
+  }
+
+  template <typename Duration, typename Unit>
+  Duration CeilTimePoint(const int64_t t, const RoundTemporalOptions* options) const {
+    const Duration d = FloorTimePoint<Duration, Unit>(t, options);
+    if (d.count() == t) {
+      return d;
+    }
+    return FloorTimePoint<Duration, Unit>(
+        t + duration_cast<Duration>(Unit{options->multiple}).count(), options);
+  }
+
+  template <typename Duration, typename Unit>
+  Duration RoundTimePoint(const int64_t t, const RoundTemporalOptions* options) const {
+    const Duration f = FloorTimePoint<Duration, Unit>(t, options);
+    Duration c;
+    if (f.count() == t) {
+      c = f;
+    } else {
+      c = FloorTimePoint<Duration, Unit>(
+          t + duration_cast<Duration>(Unit{options->multiple}).count(), options);

Review Comment:
   Same question here wrt. calling `floor` twice.



##########
cpp/src/arrow/compute/api_scalar.h:
##########
@@ -104,10 +104,28 @@ enum class CalendarUnit : int8_t {
   YEAR
 };
 
+/// \brief How to interpret ambiguous local times that can be interpreted as
+/// multiple instants (normally two) due to DST shifts.
+///
+/// AMBIGUOUS_EARLIEST emits the earliest instant amongst possible interpretations.
+/// AMBIGUOUS_LATEST emits the latest instant amongst possible interpretations.
+enum AmbiguousTime { AMBIGUOUS_RAISE, AMBIGUOUS_EARLIEST, AMBIGUOUS_LATEST };

Review Comment:
   Use a scoped enum instead:
   ```suggestion
   enum class AmbiguousTime : int8_t { RAISE, EARLIEST, LATEST };
   ```
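
   For context, a small self-contained sketch of what the scoped form buys. The unscoped enum is renamed `AmbiguousTimeUnscoped` here only so both variants can coexist in one file, and applying the same shape to `NonexistentTime` is an assumption based on the "Same here." comment.

   ```cpp
   #include <cstdint>
   #include <type_traits>

   // Unscoped: enumerators leak into the enclosing scope (hence the AMBIGUOUS_
   // prefix) and convert implicitly to int.
   enum AmbiguousTimeUnscoped { AMBIGUOUS_RAISE, AMBIGUOUS_EARLIEST, AMBIGUOUS_LATEST };

   // Scoped: enumerators are qualified by the type name, so two enums can reuse
   // RAISE/EARLIEST/LATEST without clashing, and the storage type is explicit.
   enum class AmbiguousTime : std::int8_t { RAISE, EARLIEST, LATEST };
   enum class NonexistentTime : std::int8_t { RAISE, EARLIEST, LATEST };

   static_assert(std::is_convertible<AmbiguousTimeUnscoped, int>::value,
                 "unscoped enumerators convert to int implicitly");
   static_assert(!std::is_convertible<AmbiguousTime, int>::value,
                 "scoped enumerators need an explicit cast");
   static_assert(sizeof(AmbiguousTime) == 1, "underlying type is int8_t");
   ```

   Call sites then read `AmbiguousTime::EARLIEST` rather than `AMBIGUOUS_EARLIEST`.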



##########
cpp/src/arrow/compute/kernels/scalar_temporal_test.cc:
##########
@@ -2521,6 +2515,130 @@ TEST_F(ScalarTemporalTest, TestRoundTemporal) {
   CheckScalarUnary(op, unit, times, unit, round_15_years, &round_to_15_years);
 }
 
+TEST_F(ScalarTemporalTest, TestCeilTemporalAmbiguous) {
+  std::string timezone = "CET";
+  const char* times = R"(["2018-10-28 01:20:00"])";
+  const char* times_earliest = R"(["2018-10-28 00:30:00"])";
+  const char* times_latest = R"(["2018-10-28 01:30:00"])";
+
+  auto unit = timestamp(TimeUnit::NANO, timezone);
+
+  auto options_earliest =
+      RoundTemporalOptions(15, CalendarUnit::MINUTE, true, AMBIGUOUS_EARLIEST);
+  auto options_latest =
+      RoundTemporalOptions(15, CalendarUnit::MINUTE, true, AMBIGUOUS_LATEST);
+  auto options_raise =
+      RoundTemporalOptions(15, CalendarUnit::MINUTE, true, AMBIGUOUS_RAISE);
+
+  ASSERT_RAISES(Invalid, CeilTemporal(ArrayFromJSON(unit, times), options_raise));
+  CheckScalarUnary("ceil_temporal", unit, times, unit, times_earliest, &options_earliest);
+  CheckScalarUnary("ceil_temporal", unit, times, unit, times_latest, &options_latest);
+}
+
+TEST_F(ScalarTemporalTest, TestFloorTemporalAmbiguous) {
+  std::string timezone = "CET";
+  const char* times = R"(["2018-10-28 01:20:00"])";
+  const char* times_earliest = R"(["2018-10-28 00:15:00"])";
+  const char* times_latest = R"(["2018-10-28 01:15:00"])";
+
+  auto unit = timestamp(TimeUnit::NANO, timezone);
+
+  auto options_earliest =
+      RoundTemporalOptions(15, CalendarUnit::MINUTE, true, AMBIGUOUS_EARLIEST);
+  auto options_latest =
+      RoundTemporalOptions(15, CalendarUnit::MINUTE, true, AMBIGUOUS_LATEST);
+  auto options_raise =
+      RoundTemporalOptions(15, CalendarUnit::MINUTE, true, AMBIGUOUS_RAISE);
+
+  ASSERT_RAISES(Invalid, CeilTemporal(ArrayFromJSON(unit, times), options_raise));
+  CheckScalarUnary("floor_temporal", unit, times, unit, times_earliest,
+                   &options_earliest);
+  CheckScalarUnary("floor_temporal", unit, times, unit, times_latest, &options_latest);
+}
+
+TEST_F(ScalarTemporalTest, TestRoundTemporalAmbiguous) {
+  std::string timezone = "CET";
+  const char* times = R"(["2018-10-28 01:20:00"])";
+  const char* times_earliest = R"(["2018-10-28 00:30:00"])";
+  const char* times_latest = R"(["2018-10-28 01:15:00"])";
+
+  auto unit = timestamp(TimeUnit::NANO, timezone);
+
+  auto options_earliest =
+      RoundTemporalOptions(15, CalendarUnit::MINUTE, true, AMBIGUOUS_EARLIEST);
+  auto options_latest =
+      RoundTemporalOptions(15, CalendarUnit::MINUTE, true, AMBIGUOUS_LATEST);
+  auto options_raise =
+      RoundTemporalOptions(15, CalendarUnit::MINUTE, true, AMBIGUOUS_RAISE);
+
+  ASSERT_RAISES(Invalid, CeilTemporal(ArrayFromJSON(unit, times), options_raise));

Review Comment:
   `RoundTemporal` here. Also, see other instances below.



##########
cpp/src/arrow/compute/kernels/scalar_temporal_test.cc:
##########
@@ -2611,6 +2611,106 @@ TEST_F(ScalarTemporalTest, TestRoundTemporal) {
   CheckScalarUnary(op, unit, times, unit, round_15_years, &round_to_15_years);
 }
 
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalAmbiguous1) {
+  auto unit = timestamp(TimeUnit::MILLI, "Asia/Tehran");
+  auto options = RoundTemporalOptions(1, CalendarUnit::HOUR, true);

Review Comment:
   Please spell out what `true` is for here:
   ```suggestion
     auto options = RoundTemporalOptions(1, CalendarUnit::HOUR, /*some_parameter=*/true);
   ```
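
   A self-contained sketch of the same convention, in case it helps. `RoundingOptions` and its `include_endpoint` flag are hypothetical names made up for this illustration; they are not the actual `RoundTemporalOptions` signature.

   ```cpp
   // Hypothetical options type, for illustration only.
   struct RoundingOptions {
     RoundingOptions(int multiple, bool include_endpoint)
         : multiple(multiple), include_endpoint(include_endpoint) {}
     int multiple;
     bool include_endpoint;
   };

   inline RoundingOptions MakeHourlyOptions() {
     // The inline comments name the parameters the bare literals bind to, so
     // the call can be read without looking up the constructor.
     return RoundingOptions(/*multiple=*/1, /*include_endpoint=*/true);
   }
   ```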



##########
cpp/src/arrow/compute/kernels/temporal_internal.h:
##########
@@ -119,19 +152,93 @@ struct ZonedLocalizer {
     return tz->to_local(sys_time<Duration>(Duration{t}));
   }
 
-  template <typename Duration>
-  Duration ConvertLocalToSys(Duration t, Status* st) const {
+  template <typename Duration, typename Unit>
+  Duration FloorTimePoint(const int64_t t, const RoundTemporalOptions* options) const {
+    local_time<Duration> lt = tz->to_local(sys_time<Duration>(Duration{t}));
+    const Unit d = floor<Unit>(lt).time_since_epoch();
+    Unit d2;
+
+    if (options->multiple == 1) {
+      d2 = d;
+    } else {
+      const Unit unit = Unit{options->multiple};
+      d2 = (d.count() >= 0) ? d / unit * unit : (d - unit + Unit{1}) / unit * unit;
+    }
+
     try {
-      return zoned_time<Duration>{tz, local_time<Duration>(t)}
+      return zoned_time<Duration>(tz, local_time<Duration>(duration_cast<Duration>(d2)))
           .get_sys_time()
           .time_since_epoch();
-    } catch (const arrow_vendored::date::nonexistent_local_time& e) {
-      *st = Status::Invalid("Local time does not exist: ", e.what());
-      return Duration{0};
-    } catch (const arrow_vendored::date::ambiguous_local_time& e) {
-      *st = Status::Invalid("Local time is ambiguous: ", e.what());
-      return Duration{0};
+    } catch (const arrow_vendored::date::ambiguous_local_time&) {
+      // In case we hit an ambiguous period we round to a time multiple just prior,
+      // convert to UTC and add the time unit we're rounding to.
+      const arrow_vendored::date::local_info li =
+          tz->get_info(local_time<Duration>(duration_cast<Duration>(d2)));
+
+      const Duration t2 =
+          zoned_time<Duration>(
+              tz, local_time<Duration>(duration_cast<Duration>(d2 - li.second.offset)))
+              .get_sys_time()
+              .time_since_epoch();
+      const Unit unit = Unit{options->multiple};
+      const Unit t3 =
+          (t2.count() >= 0) ? t2 / unit * unit : (t2 - unit + Unit{1}) / unit * unit;
+      const Duration t4 = duration_cast<Duration>(t3 + li.first.offset);
+      if (t < t4.count()) {
+        return duration_cast<Duration>(t3 + li.second.offset);
+      }
+      return duration_cast<Duration>(t4);
+    } catch (const arrow_vendored::date::nonexistent_local_time&) {
+      // In case we hit a nonexistent period we calculate the duration between the
+      // start of nonexistent period and rounded to moment in UTC (nonexistent_offset).
+      // We then floor the beginning of the nonexisting period in local time and add
+      // nonexistent_offset to that time point in UTC.
+      const arrow_vendored::date::local_info li =
+          tz->get_info(local_time<Duration>(duration_cast<Duration>(d2)));
+
+      const Duration t2 =
+          zoned_time<Duration>(tz, local_time<Duration>(duration_cast<Duration>(d2)),
+                               arrow_vendored::date::choose::earliest)
+              .get_sys_time()
+              .time_since_epoch();
+      const Duration nonexistent_offset = duration_cast<Duration>(t2 - d2);
+
+      const Unit unit = Unit{options->multiple};
+      const Unit t3 =
+          (t2.count() >= 0) ? t2 / unit * unit : (t2 - unit + Unit{1}) / unit * unit;
+      return duration_cast<Duration>(t3 + li.second.offset) + nonexistent_offset;
+    }
+  }
+
+  template <typename Duration, typename Unit>
+  Duration CeilTimePoint(const int64_t t, const RoundTemporalOptions* options) const {
+    const Duration d = FloorTimePoint<Duration, Unit>(t, options);
+    if (d.count() == t) {
+      return d;
+    }
+    return FloorTimePoint<Duration, Unit>(
+        t + duration_cast<Duration>(Unit{options->multiple}).count(), options);

Review Comment:
   It's a bit unexpected to call `floor` twice in the `ceil` implementation, is there perhaps a more efficient way to do this?
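
   As a rough sketch of the single-floor approach, assuming a fixed UTC offset (so no ambiguous or nonexistent local times) and a positive multiple: floor once, then add one step only if the floor is strictly below the input. `FloorToMultiple` and `CeilToMultiple` are illustrative names, not functions from this PR.

   ```cpp
   #include <chrono>
   #include <cstdint>

   // Floor `t` to a multiple of `multiple` Units, rounding toward negative
   // infinity. Assumes multiple > 0.
   template <typename Duration, typename Unit>
   Duration FloorToMultiple(Duration t, int64_t multiple) {
     // duration_cast truncates toward zero; step back one Unit for negative
     // inputs that were not already on a Unit boundary.
     Unit d = std::chrono::duration_cast<Unit>(t);
     if (d > t) d -= Unit{1};
     const Unit unit{multiple};
     const Unit m =
         (d.count() >= 0) ? d / unit * unit : (d - unit + Unit{1}) / unit * unit;
     return std::chrono::duration_cast<Duration>(m);
   }

   // Ceil derived from a single floor call.
   template <typename Duration, typename Unit>
   Duration CeilToMultiple(Duration t, int64_t multiple) {
     const Duration f = FloorToMultiple<Duration, Unit>(t, multiple);
     if (f < t) {
       return f + std::chrono::duration_cast<Duration>(Unit{multiple});
     }
     return f;
   }
   ```

   Across a DST transition, adding a fixed step to the floored value is not necessarily correct, which is the case the zoned implementation has to handle separately.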



##########
cpp/src/arrow/compute/kernels/scalar_temporal_test.cc:
##########
@@ -2521,6 +2515,130 @@ TEST_F(ScalarTemporalTest, TestRoundTemporal) {
   CheckScalarUnary(op, unit, times, unit, round_15_years, &round_to_15_years);
 }
 
+TEST_F(ScalarTemporalTest, TestCeilTemporalAmbiguous) {
+  std::string timezone = "CET";
+  const char* times = R"(["2018-10-28 01:20:00"])";
+  const char* times_earliest = R"(["2018-10-28 00:30:00"])";
+  const char* times_latest = R"(["2018-10-28 01:30:00"])";
+
+  auto unit = timestamp(TimeUnit::NANO, timezone);
+
+  auto options_earliest =
+      RoundTemporalOptions(15, CalendarUnit::MINUTE, true, AMBIGUOUS_EARLIEST);
+  auto options_latest =
+      RoundTemporalOptions(15, CalendarUnit::MINUTE, true, AMBIGUOUS_LATEST);
+  auto options_raise =
+      RoundTemporalOptions(15, CalendarUnit::MINUTE, true, AMBIGUOUS_RAISE);
+
+  ASSERT_RAISES(Invalid, CeilTemporal(ArrayFromJSON(unit, times), options_raise));
+  CheckScalarUnary("ceil_temporal", unit, times, unit, times_earliest, &options_earliest);
+  CheckScalarUnary("ceil_temporal", unit, times, unit, times_latest, &options_latest);
+}
+
+TEST_F(ScalarTemporalTest, TestFloorTemporalAmbiguous) {
+  std::string timezone = "CET";
+  const char* times = R"(["2018-10-28 01:20:00"])";
+  const char* times_earliest = R"(["2018-10-28 00:15:00"])";
+  const char* times_latest = R"(["2018-10-28 01:15:00"])";
+
+  auto unit = timestamp(TimeUnit::NANO, timezone);
+
+  auto options_earliest =
+      RoundTemporalOptions(15, CalendarUnit::MINUTE, true, AMBIGUOUS_EARLIEST);
+  auto options_latest =
+      RoundTemporalOptions(15, CalendarUnit::MINUTE, true, AMBIGUOUS_LATEST);
+  auto options_raise =
+      RoundTemporalOptions(15, CalendarUnit::MINUTE, true, AMBIGUOUS_RAISE);
+
+  ASSERT_RAISES(Invalid, CeilTemporal(ArrayFromJSON(unit, times), options_raise));

Review Comment:
   ```suggestion
     ASSERT_RAISES(Invalid, FloorTemporal(ArrayFromJSON(unit, times), options_raise));
   ```



##########
cpp/src/arrow/compute/kernels/scalar_temporal_test.cc:
##########
@@ -2611,6 +2611,106 @@ TEST_F(ScalarTemporalTest, TestRoundTemporal) {
   CheckScalarUnary(op, unit, times, unit, round_15_years, &round_to_15_years);
 }
 
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalAmbiguous1) {
+  auto unit = timestamp(TimeUnit::MILLI, "Asia/Tehran");
+  auto options = RoundTemporalOptions(1, CalendarUnit::HOUR, true);
+  const char* times = R"([

Review Comment:
   It's a bit late to suggest this, but I think we should take the habit of making tests more easily understood by spelling out the context, e.g.:
   ```suggestion
     // Asia/Tehran switched from UTC+X to UTC+Y on 2022-03-31 HH:mm:ss
     const char* times = R"([
   ```





[GitHub] [arrow] jorisvandenbossche commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1131939918

   It is "interesting"(?) to note that the R `clock` package has a variety of options for dealing with nonexistent/ambiguous cases while rounding (https://clock.r-lib.org/reference/posixt-rounding.html), but none of them actually matches option 2 from my previous comment (AFAIU, `nonexistent="roll-forward"` would match option 1 from above).




[GitHub] [arrow] rok commented on a diff in pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on code in PR #12528:
URL: https://github.com/apache/arrow/pull/12528#discussion_r867157279


##########
cpp/src/arrow/compute/kernels/scalar_temporal_test.cc:
##########
@@ -2611,6 +2611,114 @@ TEST_F(ScalarTemporalTest, TestRoundTemporal) {
   CheckScalarUnary(op, unit, times, unit, round_15_years, &round_to_15_years);
 }
 
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalAmbiguous1) {
+  // Asia/Tehran switches from UTC+4:30 to UTC+3:30 on 2022-09-22 00:00:00 UTC+4:30.
+  // This causes an hour long ambiguous period in local time.
+  auto unit = timestamp(TimeUnit::MILLI, "Asia/Tehran");
+  auto options = RoundTemporalOptions(1, CalendarUnit::HOUR, /*some_parameter=*/true);
+  const char* times = R"([
+    "2022-03-21 19:30:00", "2022-03-21 20:00:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 19:00:00", "2022-09-21 19:30:00",
+    "2022-09-21 20:00:00", "2022-09-21 20:30:00", "2022-09-21 21:00:00",
+    "2022-09-21 21:30:00"])";
+  const char* times_ceil = R"([
+    "2022-03-21 19:30:00", "2022-03-21 20:30:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 19:30:00", "2022-09-21 19:30:00",
+    "2022-09-21 20:30:00", "2022-09-21 20:30:00", "2022-09-21 21:30:00",
+    "2022-09-21 21:30:00"])";
+  const char* times_floor = R"([
+    "2022-03-21 19:30:00", "2022-03-21 19:30:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 18:30:00", "2022-09-21 19:30:00",
+    "2022-09-21 19:30:00", "2022-09-21 20:30:00", "2022-09-21 20:30:00",
+    "2022-09-21 21:30:00"])";
+  const char* times_round = R"([
+    "2022-03-21 19:30:00", "2022-03-21 20:30:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 19:30:00", "2022-09-21 19:30:00",
+    "2022-09-21 20:30:00", "2022-09-21 20:30:00", "2022-09-21 21:30:00",
+    "2022-09-21 21:30:00"])";
+
+  CheckScalarUnary("ceil_temporal", unit, times, unit, times_ceil, &options);
+  CheckScalarUnary("floor_temporal", unit, times, unit, times_floor, &options);
+  CheckScalarUnary("round_temporal", unit, times, unit, times_round, &options);
+}
+
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalAmbiguous2) {
+  // Europe/Brussels switches from UTC+2:00 to UTC+1:00 on 2022-10-25 03:00:00 UTC+2:00
+  // This causes an hour long ambiguous period in local time.
+  auto unit = timestamp(TimeUnit::NANO, "Europe/Brussels");
+  auto options = RoundTemporalOptions(15, CalendarUnit::MINUTE, true);
+  const char* times = R"([
+    "2018-10-28 01:05:00", "2018-10-28 01:20:00", "2018-10-28 01:55:00",
+    "2018-10-28 01:59:59", "2018-10-28 02:00:00", "2018-10-28 02:08:00"])";
+  const char* times_ceil = R"([
+    "2018-10-28 01:15:00", "2018-10-28 01:30:00", "2018-10-28 02:00:00",
+    "2018-10-28 02:00:00", "2018-10-28 02:00:00", "2018-10-28 02:15:00"])";
+  const char* times_floor = R"([
+    "2018-10-28 01:00:00", "2018-10-28 01:15:00", "2018-10-28 01:45:00",
+    "2018-10-28 01:45:00", "2018-10-28 02:00:00", "2018-10-28 02:00:00"])";
+  const char* times_round = R"([
+    "2018-10-28 01:00:00", "2018-10-28 01:15:00", "2018-10-28 02:00:00",
+    "2018-10-28 02:00:00", "2018-10-28 02:00:00", "2018-10-28 02:15:00"])";
+
+  CheckScalarUnary("ceil_temporal", unit, times, unit, times_ceil, &options);
+  CheckScalarUnary("floor_temporal", unit, times, unit, times_floor, &options);
+  CheckScalarUnary("round_temporal", unit, times, unit, times_round, &options);
+}
+
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalNonexistent1) {
+  // Asia/Tehran switches from UTC+3:30 to UTC+4:30 on 2022-03-22 00:00:00 UTC+3:30
+  // This causes an hour long non-existing period in local time.
+  auto unit = timestamp(TimeUnit::SECOND, "Asia/Tehran");
+  auto options = RoundTemporalOptions(1, CalendarUnit::HOUR, true);
+  const char* times = R"([
+    "2022-03-21 19:30:00", "2022-03-21 20:00:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 19:00:00", "2022-09-21 19:30:00",

Review Comment:
   Removed.





[GitHub] [arrow] rok commented on a diff in pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on code in PR #12528:
URL: https://github.com/apache/arrow/pull/12528#discussion_r867156433


##########
cpp/src/arrow/compute/kernels/scalar_temporal_test.cc:
##########
@@ -2611,6 +2611,114 @@ TEST_F(ScalarTemporalTest, TestRoundTemporal) {
   CheckScalarUnary(op, unit, times, unit, round_15_years, &round_to_15_years);
 }
 
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalAmbiguous1) {
+  // Asia/Tehran switches from UTC+4:30 to UTC+3:30 on 2022-09-22 00:00:00 UTC+4:30.
+  // This causes an hour-long ambiguous period in local time.
+  auto unit = timestamp(TimeUnit::MILLI, "Asia/Tehran");
+  auto options = RoundTemporalOptions(1, CalendarUnit::HOUR, /*some_parameter=*/true);
+  const char* times = R"([
+    "2022-03-21 19:30:00", "2022-03-21 20:00:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 19:00:00", "2022-09-21 19:30:00",
+    "2022-09-21 20:00:00", "2022-09-21 20:30:00", "2022-09-21 21:00:00",
+    "2022-09-21 21:30:00"])";
+  const char* times_ceil = R"([
+    "2022-03-21 19:30:00", "2022-03-21 20:30:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 19:30:00", "2022-09-21 19:30:00",
+    "2022-09-21 20:30:00", "2022-09-21 20:30:00", "2022-09-21 21:30:00",
+    "2022-09-21 21:30:00"])";
+  const char* times_floor = R"([
+    "2022-03-21 19:30:00", "2022-03-21 19:30:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 18:30:00", "2022-09-21 19:30:00",
+    "2022-09-21 19:30:00", "2022-09-21 20:30:00", "2022-09-21 20:30:00",
+    "2022-09-21 21:30:00"])";
+  const char* times_round = R"([
+    "2022-03-21 19:30:00", "2022-03-21 20:30:00", "2022-03-21 20:30:00",
+    "2022-09-21 18:30:00", "2022-09-21 19:30:00", "2022-09-21 19:30:00",
+    "2022-09-21 20:30:00", "2022-09-21 20:30:00", "2022-09-21 21:30:00",
+    "2022-09-21 21:30:00"])";
+
+  CheckScalarUnary("ceil_temporal", unit, times, unit, times_ceil, &options);
+  CheckScalarUnary("floor_temporal", unit, times, unit, times_floor, &options);
+  CheckScalarUnary("round_temporal", unit, times, unit, times_round, &options);
+}
+
+TEST_F(ScalarTemporalTest, TestCeilFloorRoundTemporalAmbiguous2) {
+  // Europe/Brussels switches from UTC+2:00 to UTC+1:00 on 2018-10-28 03:00:00 UTC+2:00.
+  // This causes an hour-long ambiguous period in local time.
+  auto unit = timestamp(TimeUnit::NANO, "Europe/Brussels");
+  auto options = RoundTemporalOptions(15, CalendarUnit::MINUTE, true);

Review Comment:
   Removed the parameter as it's not relevant for this test.
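
As a cross-check on the expected values in `TestCeilFloorRoundTemporalAmbiguous2`, here is a minimal sketch that floors, ceils and rounds plain wall-clock times to multiples of 15 minutes. It is not Arrow's implementation: it ignores time zones entirely, the helper names are made up for illustration, and the half-up tie-breaking in `round_to` is an assumption.

```python
STEP = 15 * 60  # 15 minutes, in seconds


def to_seconds(hhmmss):
    """Parse 'HH:MM:SS' into seconds since local midnight."""
    h, m, s = (int(part) for part in hhmmss.split(":"))
    return h * 3600 + m * 60 + s


def to_hhmmss(total):
    """Format seconds since local midnight as 'HH:MM:SS'."""
    h, rem = divmod(total, 3600)
    m, s = divmod(rem, 60)
    return f"{h:02d}:{m:02d}:{s:02d}"


def floor_to(total, step=STEP):
    """Largest multiple of `step` that is <= total."""
    return total - total % step


def ceil_to(total, step=STEP):
    """Smallest multiple of `step` that is >= total."""
    return -(-total // step) * step


def round_to(total, step=STEP):
    """Nearest multiple of `step`; ties go up (assumed convention)."""
    return floor_to(total + step // 2, step)


times = ["01:05:00", "01:20:00", "01:55:00", "01:59:59", "02:00:00", "02:08:00"]
for name, fn in [("ceil", ceil_to), ("floor", floor_to), ("round", round_to)]:
    print(name, [to_hhmmss(fn(to_seconds(t))) for t in times])
```

For these six inputs the wall-clock results agree with `times_ceil`, `times_floor` and `times_round` in the test; the sketch does not say which of the two UTC instants an ambiguous local time maps to.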





[GitHub] [arrow] pitrou commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
pitrou commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1124959551

   I don't think that is correct, because 3h08 is not a multiple of 16 minutes:
   ```
   >>> divmod(3*60+8, 16)
   (11, 12)
   ```
   2h08 is a multiple of 16 minutes, but it's also a non-existent local time:
   ```
   >>> divmod(2*60+8, 16)
   (8, 0)
   ```
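
The same check can be written as a small script. This is only a sketch: it assumes the nonexistent hour is 02:00 to 03:00 local time (hard-coded here, not read from a time zone database), and the helper names are hypothetical.

```python
# Assumed bounds of the nonexistent local hour, in minutes since midnight.
GAP_START = 2 * 60  # 02:00, first nonexistent minute (assumption)
GAP_END = 3 * 60    # 03:00, first valid minute after the gap (assumption)


def is_multiple(minutes, step):
    """True if `minutes` since local midnight is a multiple of `step`."""
    return minutes % step == 0


def in_gap(minutes):
    """True if the wall-clock time falls inside the assumed gap."""
    return GAP_START <= minutes < GAP_END


def nearest_valid_multiples(step):
    """Last multiple of `step` before the gap, first one at or after its end."""
    before = (GAP_START - 1) // step * step
    after = -(-GAP_END // step) * step
    return before, after


for hour, minute in [(3, 8), (2, 8)]:
    total = hour * 60 + minute
    print(f"{hour}h{minute:02d}: multiple of 16 min = {is_multiple(total, 16)}, "
          f"nonexistent = {in_gap(total)}")

before, after = nearest_valid_multiples(16)
print("valid multiples around the gap:", divmod(before, 60), divmod(after, 60))
```

Under that assumption, the closest multiples of 16 minutes that are valid local times are 1h52 and 3h12; which one the kernel should return for an input inside the gap is the open question here.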




[GitHub] [arrow] rok commented on pull request #12528: ARROW-15251: [C++] Temporal floor/ceil/round handle ambiguous/nonexistent local time

Posted by GitBox <gi...@apache.org>.
rok commented on PR #12528:
URL: https://github.com/apache/arrow/pull/12528#issuecomment-1126596112

   @ursabot please benchmark command=cpp-micro --suite-filter=scalar-temporal

