You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/12/17 22:51:09 UTC

[GitHub] [arrow] WillAyd opened a new pull request #11990: ARROW-15032 Add DateStruct Function

WillAyd opened a new pull request #11990:
URL: https://github.com/apache/arrow/pull/11990


   There is still one test failure with the timezone test I need to figure out, but hopefully this is still reviewable in the meantime. I wasn't really sold on `DateStruct` as the name of the compute function


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] WillAyd commented on pull request #11990: ARROW-15032: [C++] Add year_month_day function

Posted by GitBox <gi...@apache.org>.
WillAyd commented on pull request #11990:
URL: https://github.com/apache/arrow/pull/11990#issuecomment-1005238460


   No worries at all. It looks like you already changed it though?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] WillAyd commented on pull request #11990: ARROW-15032: [C++] Add DateStruct Function

Posted by GitBox <gi...@apache.org>.
WillAyd commented on pull request #11990:
URL: https://github.com/apache/arrow/pull/11990#issuecomment-1002156868


   I don't believe any of the CI failures are related to this PR but let me know if I am overlooking something


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] lidavidm closed pull request #11990: ARROW-15032: [C++] Add year_month_day function

Posted by GitBox <gi...@apache.org>.
lidavidm closed pull request #11990:
URL: https://github.com/apache/arrow/pull/11990


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] lidavidm commented on a change in pull request #11990: ARROW-15032: [C++] Add DateStruct Function

Posted by GitBox <gi...@apache.org>.
lidavidm commented on a change in pull request #11990:
URL: https://github.com/apache/arrow/pull/11990#discussion_r775574976



##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal_test.cc
##########
@@ -610,6 +673,7 @@ TEST_F(ScalarTemporalTest, TestNonexistentTimezone) {
     ASSERT_RAISES(Invalid, Year(timestamp_array));
     ASSERT_RAISES(Invalid, Month(timestamp_array));
     ASSERT_RAISES(Invalid, Day(timestamp_array));
+    ASSERT_RAISES(Invalid, YearMonthDay(timestamp_array));

Review comment:
       Ah yes, my bad, I didn't look carefully at the diff context. Thanks.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] lidavidm commented on pull request #11990: ARROW-15032: [C++] Add DateStruct Function

Posted by GitBox <gi...@apache.org>.
lidavidm commented on pull request #11990:
URL: https://github.com/apache/arrow/pull/11990#issuecomment-1002159013


   Right, I think everything is unrelated, but I did kick off the R tests again.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ursabot edited a comment on pull request #11990: ARROW-15032: [C++] Add year_month_day function

Posted by GitBox <gi...@apache.org>.
ursabot edited a comment on pull request #11990:
URL: https://github.com/apache/arrow/pull/11990#issuecomment-1005379396


   Benchmark runs are scheduled for baseline = 627720bcfd3671bbe40c22abb8ac3ef006b8f32e and contender = 49093a13bc97a172d1a7071d7d58df0241b3bc8d. 49093a13bc97a172d1a7071d7d58df0241b3bc8d is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/5d6deb3a0dd14fbaacb9be5502dc337a...145de067b4f84c6e9243ce0d61e23e42/)
   [Failed :arrow_down:0.0% :arrow_up:0.0%] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/60db97536e884c8499281e91112ad3cc...c39d0b9918dd4bbdb70f73ad4a7f71d2/)
   [Finished :arrow_down:0.79% :arrow_up:0.04%] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/cdfb136fa2914b539d7968d4d24909fc...9dfb8595d74b41efa63feff8e0ff4ae8/)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only benchmarks with cloud = True
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] lidavidm commented on pull request #11990: ARROW-15032: [C++] Add year_month_day function

Posted by GitBox <gi...@apache.org>.
lidavidm commented on pull request #11990:
URL: https://github.com/apache/arrow/pull/11990#issuecomment-1005261791


   Ah yes, that's what I meant. Thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ursabot edited a comment on pull request #11990: ARROW-15032: [C++] Add year_month_day function

Posted by GitBox <gi...@apache.org>.
ursabot edited a comment on pull request #11990:
URL: https://github.com/apache/arrow/pull/11990#issuecomment-1005379396


   Benchmark runs are scheduled for baseline = 627720bcfd3671bbe40c22abb8ac3ef006b8f32e and contender = 49093a13bc97a172d1a7071d7d58df0241b3bc8d. 49093a13bc97a172d1a7071d7d58df0241b3bc8d is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/5d6deb3a0dd14fbaacb9be5502dc337a...145de067b4f84c6e9243ce0d61e23e42/)
   [Scheduled] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/60db97536e884c8499281e91112ad3cc...c39d0b9918dd4bbdb70f73ad4a7f71d2/)
   [Scheduled] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/cdfb136fa2914b539d7968d4d24909fc...9dfb8595d74b41efa63feff8e0ff4ae8/)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only benchmarks with cloud = True
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rok commented on a change in pull request #11990: ARROW-15032 Add DateStruct Function

Posted by GitBox <gi...@apache.org>.
rok commented on a change in pull request #11990:
URL: https://github.com/apache/arrow/pull/11990#discussion_r771830498



##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -1024,6 +1024,18 @@ Result<Datum> Month(const Datum& values, ExecContext* ctx = NULLPTR);
 ARROW_EXPORT
 Result<Datum> Day(const Datum& values, ExecContext* ctx = NULLPTR);
 
+/// \brief DateStruct returns a struct containing the Year, Month and Day value for
+/// each element of `values`.
+///
+/// \param[in] values input to extract Year/Month/Day from

Review comment:
       ```suggestion
   /// \param[in] values input to extract (year, month, day) struct from
   ```

##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal_unary.cc
##########
@@ -185,6 +191,98 @@ struct Day {
   Localizer localizer_;
 };
 
+// ----------------------------------------------------------------------
+// Extract year/month/day from temporal types

Review comment:
       ```suggestion
   // Extract (year, month, day) struct from temporal types
   ```

##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal_unary.cc
##########
@@ -833,6 +931,13 @@ const FunctionDoc day_doc{
      "cannot be found in the timezone database."),
     {"values"}};
 
+const FunctionDoc date_struct_doc{
+  "Extract year/month/day",

Review comment:
       ```suggestion
     "Extract (year, month, day) struct",
   ```

##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -1024,6 +1024,18 @@ Result<Datum> Month(const Datum& values, ExecContext* ctx = NULLPTR);
 ARROW_EXPORT
 Result<Datum> Day(const Datum& values, ExecContext* ctx = NULLPTR);
 
+/// \brief DateStruct returns a struct containing the Year, Month and Day value for
+/// each element of `values`.
+///
+/// \param[in] values input to extract Year/Month/Day from
+/// \param[in] ctx the function execution context, optional
+/// \return the resulting datum
+///
+/// \since 6.1.0

Review comment:
       ```suggestion
   /// \since 7.0.0
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rok commented on pull request #11990: ARROW-15032: [C++] Add DateStruct Function

Posted by GitBox <gi...@apache.org>.
rok commented on pull request #11990:
URL: https://github.com/apache/arrow/pull/11990#issuecomment-999956415


   Didn't read the new commit yet but I wonder could we share some infrastructure of `ISOCalendar` with `DateStruct`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] WillAyd commented on a change in pull request #11990: ARROW-15032: [C++] Add DateStruct Function

Posted by GitBox <gi...@apache.org>.
WillAyd commented on a change in pull request #11990:
URL: https://github.com/apache/arrow/pull/11990#discussion_r775568843



##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal_test.cc
##########
@@ -610,6 +673,7 @@ TEST_F(ScalarTemporalTest, TestNonexistentTimezone) {
     ASSERT_RAISES(Invalid, Year(timestamp_array));
     ASSERT_RAISES(Invalid, Month(timestamp_array));
     ASSERT_RAISES(Invalid, Day(timestamp_array));
+    ASSERT_RAISES(Invalid, YearMonthDay(timestamp_array));

Review comment:
       I think those are already there in the TestZoned1 and TestZoned2 tests (?)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ursabot edited a comment on pull request #11990: ARROW-15032: [C++] Add year_month_day function

Posted by GitBox <gi...@apache.org>.
ursabot edited a comment on pull request #11990:
URL: https://github.com/apache/arrow/pull/11990#issuecomment-1005379396


   Benchmark runs are scheduled for baseline = 627720bcfd3671bbe40c22abb8ac3ef006b8f32e and contender = 49093a13bc97a172d1a7071d7d58df0241b3bc8d. 49093a13bc97a172d1a7071d7d58df0241b3bc8d is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/5d6deb3a0dd14fbaacb9be5502dc337a...145de067b4f84c6e9243ce0d61e23e42/)
   [Failed :arrow_down:0.0% :arrow_up:0.0%] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/60db97536e884c8499281e91112ad3cc...c39d0b9918dd4bbdb70f73ad4a7f71d2/)
   [Scheduled] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/cdfb136fa2914b539d7968d4d24909fc...9dfb8595d74b41efa63feff8e0ff4ae8/)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only benchmarks with cloud = True
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] lidavidm commented on pull request #11990: ARROW-15032: [C++] Add year_month_day function

Posted by GitBox <gi...@apache.org>.
lidavidm commented on pull request #11990:
URL: https://github.com/apache/arrow/pull/11990#issuecomment-1005238746


   Oh I edited the title - do you mind also updating the description?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] WillAyd commented on a change in pull request #11990: ARROW-15032 Add DateStruct Function

Posted by GitBox <gi...@apache.org>.
WillAyd commented on a change in pull request #11990:
URL: https://github.com/apache/arrow/pull/11990#discussion_r771848861



##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -1024,6 +1024,18 @@ Result<Datum> Month(const Datum& values, ExecContext* ctx = NULLPTR);
 ARROW_EXPORT
 Result<Datum> Day(const Datum& values, ExecContext* ctx = NULLPTR);
 
+/// \brief DateStruct returns a struct containing the Year, Month and Day value for
+/// each element of `values`.
+///
+/// \param[in] values input to extract (year, month, day) struct from
+/// \param[in] ctx the function execution context, optional
+/// \return the resulting datum
+///
+/// \since 7.0.0
+/// \note API not yet finalized
+ARROW_EXPORT
+Result<Datum> DateStruct(const Datum& values, ExecContext* ctx = NULLPTR);

Review comment:
       Great idea - love this




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #11990: ARROW-15032 Add DateStruct Function

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #11990:
URL: https://github.com/apache/arrow/pull/11990#issuecomment-997076200






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] WillAyd commented on pull request #11990: ARROW-15032: [C++] Add DateStruct Function

Posted by GitBox <gi...@apache.org>.
WillAyd commented on pull request #11990:
URL: https://github.com/apache/arrow/pull/11990#issuecomment-999877018


   Updated the PR title. Looking at your suggestion, is that different from what we are trying on this PR in its current state? Might be overlooking but I think that's what I've already tried in this permalink
   
   https://github.com/apache/arrow/blob/c59783ab7edc74f16b4e709a581229aed91cf44d/cpp/src/arrow/compute/kernels/scalar_temporal_unary.cc#L198


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ursabot commented on pull request #11990: ARROW-15032: [C++] Add year_month_day function

Posted by GitBox <gi...@apache.org>.
ursabot commented on pull request #11990:
URL: https://github.com/apache/arrow/pull/11990#issuecomment-1005379396


   Benchmark runs are scheduled for baseline = 627720bcfd3671bbe40c22abb8ac3ef006b8f32e and contender = 49093a13bc97a172d1a7071d7d58df0241b3bc8d. 49093a13bc97a172d1a7071d7d58df0241b3bc8d is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Scheduled] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/5d6deb3a0dd14fbaacb9be5502dc337a...145de067b4f84c6e9243ce0d61e23e42/)
   [Scheduled] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/60db97536e884c8499281e91112ad3cc...c39d0b9918dd4bbdb70f73ad4a7f71d2/)
   [Scheduled] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/cdfb136fa2914b539d7968d4d24909fc...9dfb8595d74b41efa63feff8e0ff4ae8/)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only benchmarks with cloud = True
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] lidavidm commented on a change in pull request #11990: ARROW-15032: [C++] Add DateStruct Function

Posted by GitBox <gi...@apache.org>.
lidavidm commented on a change in pull request #11990:
URL: https://github.com/apache/arrow/pull/11990#discussion_r777466662



##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal_test.cc
##########
@@ -546,6 +591,23 @@ TEST_F(ScalarTemporalTest, TestZoned2) {
     auto unit = timestamp(u, "Australia/Broken_Hill");
     auto month = "[1, 3, 1, 5, 1, 12, 12, 12, 1, 1, 1, 1, 12, 12, 12, 1, null]";
     auto day = "[1, 1, 1, 18, 1, 31, 30, 31, 1, 3, 4, 1, 31, 28, 29, 1, null]";
+    auto year_month_day = ArrayFromJSON(year_month_day_type,
+                                        R"([{"year": 1970, "month": 1, "day": 1},
+                          {"year": 2000, "month": 3, "day": 1},
+                          {"year": 1899, "month": 1, "day": 1},
+                          {"year": 2033, "month": 5, "day": 18},
+                          {"year": 2020, "month": 1, "day": 1},
+                          {"year": 2019, "month": 12, "day": 31},
+                          {"year": 2019, "month": 12, "day": 30},
+                          {"year": 2009, "month": 12, "day": 31},
+                          {"year": 2010, "month": 1, "day": 1},
+                          {"year": 2010, "month": 1, "day": 3},
+                          {"year": 2010, "month": 1, "day": 4},
+                          {"year": 2006, "month": 1, "day": 1},
+                          {"year": 2005, "month": 12, "day": 31},
+                          {"year": 2008, "month": 12, "day": 28},
+                          {"year": 2008, "month": 12, "day": 29},
+                          {"year": 2012, "month": 1, "day": 1}, null])");

Review comment:
       If I'm not mistaken, we added the expected values to this test but not the actual assertion.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] WillAyd commented on a change in pull request #11990: ARROW-15032 Add DateStruct Function

Posted by GitBox <gi...@apache.org>.
WillAyd commented on a change in pull request #11990:
URL: https://github.com/apache/arrow/pull/11990#discussion_r771849033



##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal_unary.cc
##########
@@ -185,6 +191,98 @@ struct Day {
   Localizer localizer_;
 };
 
+// ----------------------------------------------------------------------
+// Extract (year, month, day) struct from temporal types
+
+template <typename Duration, typename InType, typename BuilderType>
+struct DateStructVisitValueFunction {
+  static Result<std::function<Status(typename InType::c_type arg)>> Get(
+      const std::vector<BuilderType*>& field_builders, const ArrayData&,
+      StructBuilder* struct_builder) {
+    return [=](typename InType::c_type arg) {
+      const auto ymd = year_month_day(floor<days>(NonZonedLocalizer{}.template ConvertTimePoint<Duration>(arg)));;
+      field_builders[0]->UnsafeAppend(static_cast<const int32_t>(ymd.year()));
+      field_builders[1]->UnsafeAppend(static_cast<const uint32_t>(ymd.month()));
+      field_builders[2]->UnsafeAppend(static_cast<const uint32_t>(ymd.day()));
+      return struct_builder->Append();
+    };
+  };
+};  
+
+template <typename Duration, typename BuilderType>
+struct DateStructVisitValueFunction<Duration, TimestampType, BuilderType> {
+  static Result<std::function<Status(typename TimestampType::c_type arg)>> Get(
+      const std::vector<BuilderType*>& field_builders, const ArrayData& in,
+      StructBuilder* struct_builder) {
+    const auto& timezone = GetInputTimezone(in);
+    if (timezone.empty()) {
+      return [=](TimestampType::c_type arg) {
+        const auto ymd = year_month_day(floor<days>(NonZonedLocalizer{}.template ConvertTimePoint<Duration>(arg)));;
+        field_builders[0]->UnsafeAppend(static_cast<const int32_t>(ymd.year()));
+        field_builders[1]->UnsafeAppend(static_cast<const uint32_t>(ymd.month()));
+        field_builders[2]->UnsafeAppend(static_cast<const uint32_t>(ymd.day()));
+        return struct_builder->Append();
+      };
+    }
+    ARROW_ASSIGN_OR_RAISE(auto tz, LocateZone(timezone));
+    return [=](TimestampType::c_type arg) {
+        const auto ymd = year_month_day(floor<days>(ZonedLocalizer{tz}.template ConvertTimePoint<Duration>(arg)));      
+        field_builders[0]->UnsafeAppend(static_cast<const int32_t>(ymd.year()));
+        field_builders[1]->UnsafeAppend(static_cast<const uint32_t>(ymd.month()));
+        field_builders[2]->UnsafeAppend(static_cast<const uint32_t>(ymd.day()));
+      return struct_builder->Append();
+    };
+  }
+};
+
+template <typename Duration, typename InType>
+struct DateStruct {
+  static Status Call(KernelContext* ctx, const Scalar& in, Scalar* out) {
+    if (in.is_valid) {
+      const auto& in_val = internal::UnboxScalar<const InType>::Unbox(in);
+      const auto t = floor<days>(NonZonedLocalizer{}.template ConvertTimePoint<Duration>(in_val));

Review comment:
       Yea I was a little wary of how this handled timezones. This was copied over from ISOCalendar, which I'm guessing may have different requirements.
   
   Will give it another pass and see if I can match the other kernels more closely 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] lidavidm commented on pull request #11990: ARROW-15032: [C++] Add DateStruct Function

Posted by GitBox <gi...@apache.org>.
lidavidm commented on pull request #11990:
URL: https://github.com/apache/arrow/pull/11990#issuecomment-999877614


   Yes, I mean to write out the kernel "loop" yourself (as you originally did) instead of using the templated helpers (which you've found don't work).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] lidavidm commented on pull request #11990: ARROW-15032: [C++] Add DateStruct Function

Posted by GitBox <gi...@apache.org>.
lidavidm commented on pull request #11990:
URL: https://github.com/apache/arrow/pull/11990#issuecomment-1005185700


   @WillAyd sorry, I always forget this: do you mind updating the PR description as well? It gets used as part of the commit message


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] WillAyd commented on pull request #11990: ARROW-15032: [C++] Add year_month_day function

Posted by GitBox <gi...@apache.org>.
WillAyd commented on pull request #11990:
URL: https://github.com/apache/arrow/pull/11990#issuecomment-1005261491


   I'm guessing the description is pulled from the OP. Just updated


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] lidavidm commented on pull request #11990: ARROW-15032: [C++] Add DateStruct Function

Posted by GitBox <gi...@apache.org>.
lidavidm commented on pull request #11990:
URL: https://github.com/apache/arrow/pull/11990#issuecomment-1005185913


   (Last thing for sure this time :crossed_fingers:)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rok commented on pull request #11990: ARROW-15032: [C++] Add DateStruct Function

Posted by GitBox <gi...@apache.org>.
rok commented on pull request #11990:
URL: https://github.com/apache/arrow/pull/11990#issuecomment-1005116819


   LGTM!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] WillAyd commented on pull request #11990: ARROW-15032 Add DateStruct Function

Posted by GitBox <gi...@apache.org>.
WillAyd commented on pull request #11990:
URL: https://github.com/apache/arrow/pull/11990#issuecomment-999273534


   Updated to make YearMonthDay work more closely to the other Unary funcs. This doesn't build atm because (as far as I can tell) passing a StructArray to the `UnaryTemporalFactory` doesn't resolve to existing specializations. Need to study the factory machinery a little more, but I'm guessing something needs to be changed with the factories themselves to allow this


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] lidavidm commented on pull request #11990: ARROW-15032 Add DateStruct Function

Posted by GitBox <gi...@apache.org>.
lidavidm commented on pull request #11990:
URL: https://github.com/apache/arrow/pull/11990#issuecomment-999561189


   > > Updated to make YearMonthDay work more closely to the other Unary funcs. This doesn't build atm because (as far as I can tell) passing a StructArray to the `UnaryTemporalFactory` doesn't resolve to existing specializations. Need to study the factory machinery a little more, but I'm guessing something needs to be changed with the factories themselves to allow this
   > 
   > I think this is also the reason `ISOCalendar` is implemented the way is was.
   
   Apologies, you are right, you cannot implement it exactly the same way as the other unary functions.
   
   I meant something more like this:
   
   ```cpp
   template <typename Duration, typename Localizer>
   struct YearMonthDay {
     explicit YearMonthDay(const FunctionOptions* options, Localizer&& localizer)
         : localizer_(std::move(localizer)) {}
   
     template <typename T, typename Arg0>
     T Call(KernelContext*, Arg0 arg, Status*) const {
       static_assert(std::is_same<T, year_month_day>, "");
       return year_month_day(floor<days>(localizer_.template ConvertTimePoint<Duration>(arg)));
     }
   
     Localizer localizer_;
   };
   
   // ...
   
   // Original PR implementation, but using the YearMonthDay struct to share some of the code between the scalar/array cases
   
   ```
   
   i.e. to just factor out a bit more of the shared code between the scalar/array cases.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rok commented on a change in pull request #11990: ARROW-15032 Add DateStruct Function

Posted by GitBox <gi...@apache.org>.
rok commented on a change in pull request #11990:
URL: https://github.com/apache/arrow/pull/11990#discussion_r771858259



##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal_unary.cc
##########
@@ -185,6 +191,98 @@ struct Day {
   Localizer localizer_;
 };
 
+// ----------------------------------------------------------------------
+// Extract (year, month, day) struct from temporal types
+
+template <typename Duration, typename InType, typename BuilderType>
+struct DateStructVisitValueFunction {
+  static Result<std::function<Status(typename InType::c_type arg)>> Get(
+      const std::vector<BuilderType*>& field_builders, const ArrayData&,
+      StructBuilder* struct_builder) {
+    return [=](typename InType::c_type arg) {
+      const auto ymd = year_month_day(floor<days>(NonZonedLocalizer{}.template ConvertTimePoint<Duration>(arg)));;
+      field_builders[0]->UnsafeAppend(static_cast<const int32_t>(ymd.year()));
+      field_builders[1]->UnsafeAppend(static_cast<const uint32_t>(ymd.month()));
+      field_builders[2]->UnsafeAppend(static_cast<const uint32_t>(ymd.day()));
+      return struct_builder->Append();
+    };
+  };
+};  
+
+template <typename Duration, typename BuilderType>
+struct DateStructVisitValueFunction<Duration, TimestampType, BuilderType> {
+  static Result<std::function<Status(typename TimestampType::c_type arg)>> Get(
+      const std::vector<BuilderType*>& field_builders, const ArrayData& in,
+      StructBuilder* struct_builder) {
+    const auto& timezone = GetInputTimezone(in);
+    if (timezone.empty()) {
+      return [=](TimestampType::c_type arg) {
+        const auto ymd = year_month_day(floor<days>(NonZonedLocalizer{}.template ConvertTimePoint<Duration>(arg)));;
+        field_builders[0]->UnsafeAppend(static_cast<const int32_t>(ymd.year()));
+        field_builders[1]->UnsafeAppend(static_cast<const uint32_t>(ymd.month()));
+        field_builders[2]->UnsafeAppend(static_cast<const uint32_t>(ymd.day()));
+        return struct_builder->Append();
+      };
+    }
+    ARROW_ASSIGN_OR_RAISE(auto tz, LocateZone(timezone));
+    return [=](TimestampType::c_type arg) {
+        const auto ymd = year_month_day(floor<days>(ZonedLocalizer{tz}.template ConvertTimePoint<Duration>(arg)));      
+        field_builders[0]->UnsafeAppend(static_cast<const int32_t>(ymd.year()));
+        field_builders[1]->UnsafeAppend(static_cast<const uint32_t>(ymd.month()));
+        field_builders[2]->UnsafeAppend(static_cast<const uint32_t>(ymd.day()));
+      return struct_builder->Append();
+    };
+  }
+};
+
+template <typename Duration, typename InType>
+struct DateStruct {
+  static Status Call(KernelContext* ctx, const Scalar& in, Scalar* out) {
+    if (in.is_valid) {
+      const auto& in_val = internal::UnboxScalar<const InType>::Unbox(in);
+      const auto t = floor<days>(NonZonedLocalizer{}.template ConvertTimePoint<Duration>(in_val));

Review comment:
       Handling timezones would be nice.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rok commented on pull request #11990: ARROW-15032 Add DateStruct Function

Posted by GitBox <gi...@apache.org>.
rok commented on pull request #11990:
URL: https://github.com/apache/arrow/pull/11990#issuecomment-999515836


   > Updated to make YearMonthDay work more closely to the other Unary funcs. This doesn't build atm because (as far as I can tell) passing a StructArray to the `UnaryTemporalFactory` doesn't resolve to existing specializations. Need to study the factory machinery a little more, but I'm guessing something needs to be changed with the factories themselves to allow this
   
   I think this is also the reason `ISOCalendar` is implemented the way is was. @lidavidm can unary kernels return StructArray?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] lidavidm commented on a change in pull request #11990: ARROW-15032 Add DateStruct Function

Posted by GitBox <gi...@apache.org>.
lidavidm commented on a change in pull request #11990:
URL: https://github.com/apache/arrow/pull/11990#discussion_r771842555



##########
File path: cpp/src/arrow/compute/api_scalar.h
##########
@@ -1024,6 +1024,18 @@ Result<Datum> Month(const Datum& values, ExecContext* ctx = NULLPTR);
 ARROW_EXPORT
 Result<Datum> Day(const Datum& values, ExecContext* ctx = NULLPTR);
 
+/// \brief DateStruct returns a struct containing the Year, Month and Day value for
+/// each element of `values`.
+///
+/// \param[in] values input to extract (year, month, day) struct from
+/// \param[in] ctx the function execution context, optional
+/// \return the resulting datum
+///
+/// \since 7.0.0
+/// \note API not yet finalized
+ARROW_EXPORT
+Result<Datum> DateStruct(const Datum& values, ExecContext* ctx = NULLPTR);

Review comment:
       `YearMonthDay`/`year_month_day` might be a little more consistent with the naming here?

##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal_test.cc
##########
@@ -580,6 +627,7 @@ TEST_F(ScalarTemporalTest, TestZoned2) {
     CheckScalarUnary("year", unit, times_seconds_precision, int64(), year);
     CheckScalarUnary("month", unit, times_seconds_precision, int64(), month);
     CheckScalarUnary("day", unit, times_seconds_precision, int64(), day);
+    // TODO: where is year assigned in this function?

Review comment:
       TEST_F declares the test as using a [test fixture in Googletest](https://google.github.io/googletest/primer.html#same-data-multiple-tests) so it effectively becomes a method of the fixture class above. Hence `year` comes from here: https://github.com/apache/arrow/blob/b5a47e166c7c168c3393256d4da48bf7b7f6add5/cpp/src/arrow/compute/kernels/scalar_temporal_test.cc#L141

##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal_unary.cc
##########
@@ -185,6 +191,98 @@ struct Day {
   Localizer localizer_;
 };
 
+// ----------------------------------------------------------------------
+// Extract (year, month, day) struct from temporal types
+
+template <typename Duration, typename InType, typename BuilderType>
+struct DateStructVisitValueFunction {
+  static Result<std::function<Status(typename InType::c_type arg)>> Get(
+      const std::vector<BuilderType*>& field_builders, const ArrayData&,
+      StructBuilder* struct_builder) {
+    return [=](typename InType::c_type arg) {
+      const auto ymd = year_month_day(floor<days>(NonZonedLocalizer{}.template ConvertTimePoint<Duration>(arg)));;
+      field_builders[0]->UnsafeAppend(static_cast<const int32_t>(ymd.year()));
+      field_builders[1]->UnsafeAppend(static_cast<const uint32_t>(ymd.month()));
+      field_builders[2]->UnsafeAppend(static_cast<const uint32_t>(ymd.day()));
+      return struct_builder->Append();
+    };
+  };
+};  
+
+template <typename Duration, typename BuilderType>
+struct DateStructVisitValueFunction<Duration, TimestampType, BuilderType> {
+  static Result<std::function<Status(typename TimestampType::c_type arg)>> Get(
+      const std::vector<BuilderType*>& field_builders, const ArrayData& in,
+      StructBuilder* struct_builder) {
+    const auto& timezone = GetInputTimezone(in);
+    if (timezone.empty()) {
+      return [=](TimestampType::c_type arg) {
+        const auto ymd = year_month_day(floor<days>(NonZonedLocalizer{}.template ConvertTimePoint<Duration>(arg)));;
+        field_builders[0]->UnsafeAppend(static_cast<const int32_t>(ymd.year()));
+        field_builders[1]->UnsafeAppend(static_cast<const uint32_t>(ymd.month()));
+        field_builders[2]->UnsafeAppend(static_cast<const uint32_t>(ymd.day()));
+        return struct_builder->Append();
+      };
+    }
+    ARROW_ASSIGN_OR_RAISE(auto tz, LocateZone(timezone));
+    return [=](TimestampType::c_type arg) {
+        const auto ymd = year_month_day(floor<days>(ZonedLocalizer{tz}.template ConvertTimePoint<Duration>(arg)));      
+        field_builders[0]->UnsafeAppend(static_cast<const int32_t>(ymd.year()));
+        field_builders[1]->UnsafeAppend(static_cast<const uint32_t>(ymd.month()));
+        field_builders[2]->UnsafeAppend(static_cast<const uint32_t>(ymd.day()));
+      return struct_builder->Append();
+    };
+  }
+};
+
+template <typename Duration, typename InType>
+struct DateStruct {
+  static Status Call(KernelContext* ctx, const Scalar& in, Scalar* out) {
+    if (in.is_valid) {
+      const auto& in_val = internal::UnboxScalar<const InType>::Unbox(in);
+      const auto t = floor<days>(NonZonedLocalizer{}.template ConvertTimePoint<Duration>(in_val));

Review comment:
       It might be possible to structure things more like the other kernels except perhaps return `year_month_day`. And then we can dispatch to the right one variant here and reuse the same implementation across the scalar/array and zoned/naive timestamp cases.

##########
File path: docs/source/cpp/compute.rst
##########
@@ -1390,29 +1392,31 @@ For timestamps inputs with non-empty timezone, localized timestamp components wi
 +--------------------+------------+-------------------+---------------+----------------------------+-------+
 | subsecond          | Unary      | Timestamp, Time   | Double        |                            |       |
 +--------------------+------------+-------------------+---------------+----------------------------+-------+
-| us_week            | Unary      | Temporal          | Int64         |                            | \(4)  |
+| us_week            | Unary      | Temporal          | Int64         |                            | \(5)  |
 +--------------------+------------+-------------------+---------------+----------------------------+-------+
-| week               | Unary      | Timestamp         | Int64         | :struct:`WeekOptions`      | \(5)  |
+| week               | Unary      | Timestamp         | Int64         | :struct:`WeekOptions`      | \(6)  |
 +--------------------+------------+-------------------+---------------+----------------------------+-------+
 | year               | Unary      | Temporal          | Int64         |                            |       |
 +--------------------+------------+-------------------+---------------+----------------------------+-------+
 
-* \(1) Outputs the number of the day of the week. By default week begins on Monday
+* \(1) Output is a ``{"year": output type, "month": output type, "day": output type}`` Struct.

Review comment:
       This should be something like `{"year": int64(), ...}` no?

##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal_unary.cc
##########
@@ -833,6 +931,13 @@ const FunctionDoc day_doc{
      "cannot be found in the timezone database."),
     {"values"}};
 
+const FunctionDoc date_struct_doc{
+  "Extract (year, month, day) struct",
+  ("Null values emit null.\n"
+   "An error is returned in the values have a defined timezone but it\n"
+   "cannot be found in the timezone database."),

Review comment:
       ```suggestion
      "An error is returned if the values have a defined timezone but it\n"
      "cannot be found in the timezone database."),
   ```

##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal_unary.cc
##########
@@ -185,6 +191,98 @@ struct Day {
   Localizer localizer_;
 };
 
+// ----------------------------------------------------------------------
+// Extract (year, month, day) struct from temporal types
+
+template <typename Duration, typename InType, typename BuilderType>
+struct DateStructVisitValueFunction {
+  static Result<std::function<Status(typename InType::c_type arg)>> Get(
+      const std::vector<BuilderType*>& field_builders, const ArrayData&,
+      StructBuilder* struct_builder) {
+    return [=](typename InType::c_type arg) {
+      const auto ymd = year_month_day(floor<days>(NonZonedLocalizer{}.template ConvertTimePoint<Duration>(arg)));;
+      field_builders[0]->UnsafeAppend(static_cast<const int32_t>(ymd.year()));
+      field_builders[1]->UnsafeAppend(static_cast<const uint32_t>(ymd.month()));
+      field_builders[2]->UnsafeAppend(static_cast<const uint32_t>(ymd.day()));
+      return struct_builder->Append();
+    };
+  };
+};  
+
+template <typename Duration, typename BuilderType>
+struct DateStructVisitValueFunction<Duration, TimestampType, BuilderType> {
+  static Result<std::function<Status(typename TimestampType::c_type arg)>> Get(
+      const std::vector<BuilderType*>& field_builders, const ArrayData& in,
+      StructBuilder* struct_builder) {
+    const auto& timezone = GetInputTimezone(in);
+    if (timezone.empty()) {
+      return [=](TimestampType::c_type arg) {
+        const auto ymd = year_month_day(floor<days>(NonZonedLocalizer{}.template ConvertTimePoint<Duration>(arg)));;
+        field_builders[0]->UnsafeAppend(static_cast<const int32_t>(ymd.year()));
+        field_builders[1]->UnsafeAppend(static_cast<const uint32_t>(ymd.month()));
+        field_builders[2]->UnsafeAppend(static_cast<const uint32_t>(ymd.day()));
+        return struct_builder->Append();
+      };
+    }
+    ARROW_ASSIGN_OR_RAISE(auto tz, LocateZone(timezone));
+    return [=](TimestampType::c_type arg) {
+        const auto ymd = year_month_day(floor<days>(ZonedLocalizer{tz}.template ConvertTimePoint<Duration>(arg)));      
+        field_builders[0]->UnsafeAppend(static_cast<const int32_t>(ymd.year()));
+        field_builders[1]->UnsafeAppend(static_cast<const uint32_t>(ymd.month()));
+        field_builders[2]->UnsafeAppend(static_cast<const uint32_t>(ymd.day()));
+      return struct_builder->Append();
+    };
+  }
+};
+
+template <typename Duration, typename InType>
+struct DateStruct {
+  static Status Call(KernelContext* ctx, const Scalar& in, Scalar* out) {
+    if (in.is_valid) {
+      const auto& in_val = internal::UnboxScalar<const InType>::Unbox(in);
+      const auto t = floor<days>(NonZonedLocalizer{}.template ConvertTimePoint<Duration>(in_val));

Review comment:
       What if the scalar type has a timezone?

##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal_unary.cc
##########
@@ -185,6 +191,98 @@ struct Day {
   Localizer localizer_;
 };
 
+// ----------------------------------------------------------------------
+// Extract (year, month, day) struct from temporal types
+
+template <typename Duration, typename InType, typename BuilderType>
+struct DateStructVisitValueFunction {
+  static Result<std::function<Status(typename InType::c_type arg)>> Get(
+      const std::vector<BuilderType*>& field_builders, const ArrayData&,
+      StructBuilder* struct_builder) {
+    return [=](typename InType::c_type arg) {
+      const auto ymd = year_month_day(floor<days>(NonZonedLocalizer{}.template ConvertTimePoint<Duration>(arg)));;
+      field_builders[0]->UnsafeAppend(static_cast<const int32_t>(ymd.year()));
+      field_builders[1]->UnsafeAppend(static_cast<const uint32_t>(ymd.month()));
+      field_builders[2]->UnsafeAppend(static_cast<const uint32_t>(ymd.day()));
+      return struct_builder->Append();
+    };
+  };
+};  
+
+template <typename Duration, typename BuilderType>
+struct DateStructVisitValueFunction<Duration, TimestampType, BuilderType> {
+  static Result<std::function<Status(typename TimestampType::c_type arg)>> Get(
+      const std::vector<BuilderType*>& field_builders, const ArrayData& in,
+      StructBuilder* struct_builder) {
+    const auto& timezone = GetInputTimezone(in);
+    if (timezone.empty()) {
+      return [=](TimestampType::c_type arg) {
+        const auto ymd = year_month_day(floor<days>(NonZonedLocalizer{}.template ConvertTimePoint<Duration>(arg)));;
+        field_builders[0]->UnsafeAppend(static_cast<const int32_t>(ymd.year()));
+        field_builders[1]->UnsafeAppend(static_cast<const uint32_t>(ymd.month()));
+        field_builders[2]->UnsafeAppend(static_cast<const uint32_t>(ymd.day()));
+        return struct_builder->Append();
+      };
+    }
+    ARROW_ASSIGN_OR_RAISE(auto tz, LocateZone(timezone));
+    return [=](TimestampType::c_type arg) {
+        const auto ymd = year_month_day(floor<days>(ZonedLocalizer{tz}.template ConvertTimePoint<Duration>(arg)));      
+        field_builders[0]->UnsafeAppend(static_cast<const int32_t>(ymd.year()));
+        field_builders[1]->UnsafeAppend(static_cast<const uint32_t>(ymd.month()));
+        field_builders[2]->UnsafeAppend(static_cast<const uint32_t>(ymd.day()));
+      return struct_builder->Append();
+    };
+  }
+};
+
+template <typename Duration, typename InType>
+struct DateStruct {
+  static Status Call(KernelContext* ctx, const Scalar& in, Scalar* out) {
+    if (in.is_valid) {
+      const auto& in_val = internal::UnboxScalar<const InType>::Unbox(in);
+      const auto t = floor<days>(NonZonedLocalizer{}.template ConvertTimePoint<Duration>(in_val));
+      const auto ymd = year_month_day(t);
+      
+      ScalarVector values = {std::make_shared<Int64Scalar>(static_cast<const int32_t>(ymd.year())),
+	std::make_shared<Int64Scalar>(static_cast<const uint32_t>(ymd.month())),
+	std::make_shared<Int64Scalar>(static_cast<const uint32_t>(ymd.day()))};
+      *checked_cast<StructScalar*>(out) =
+          StructScalar(std::move(values), DateStructType());

Review comment:
       You're going to want to format the code. See https://arrow.apache.org/docs/developers/cpp/development.html#code-style-linting-and-ci




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] WillAyd commented on pull request #11990: ARROW-15032: [C++] Add DateStruct Function

Posted by GitBox <gi...@apache.org>.
WillAyd commented on pull request #11990:
URL: https://github.com/apache/arrow/pull/11990#issuecomment-1000615827


   Alright so I have this to a point where ISOCalendar and YearMonthDay are practically mirror images of one another, save different implementations of a `GetX` function. In terms of more code sharing my first thought was to move some of the Wrapper / Visit functions in as struct members to both ISOCalendar and YearMonthDay. I'm brand new to C++ though so if there's a more idiomatic approach you had in mind let me know and happy to work on that


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rok edited a comment on pull request #11990: ARROW-15032 Add DateStruct Function

Posted by GitBox <gi...@apache.org>.
rok edited a comment on pull request #11990:
URL: https://github.com/apache/arrow/pull/11990#issuecomment-999515836


   > Updated to make YearMonthDay work more closely to the other Unary funcs. This doesn't build atm because (as far as I can tell) passing a StructArray to the `UnaryTemporalFactory` doesn't resolve to existing specializations. Need to study the factory machinery a little more, but I'm guessing something needs to be changed with the factories themselves to allow this
   
   I think this is also the reason `ISOCalendar` is implemented the way is was.
   
   @lidavidm can unary kernels return StructArray? Also, can you approve the CI workflow? I don't have permissions to.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] lidavidm commented on pull request #11990: ARROW-15032 Add DateStruct Function

Posted by GitBox <gi...@apache.org>.
lidavidm commented on pull request #11990:
URL: https://github.com/apache/arrow/pull/11990#issuecomment-999561893


   (Tangentially: do you mind updating the PR title to fit the format? e.g. `ARROW-15032: [C++] Add DateStruct Function`)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] lidavidm commented on pull request #11990: ARROW-15032: [C++] Add DateStruct Function

Posted by GitBox <gi...@apache.org>.
lidavidm commented on pull request #11990:
URL: https://github.com/apache/arrow/pull/11990#issuecomment-999877958


   But we can separate out the logic for the actual calculation more than was done originally (IIRC) instead of hard coding a call to a localizer, etc.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] WillAyd commented on a change in pull request #11990: ARROW-15032: [C++] Add DateStruct Function

Posted by GitBox <gi...@apache.org>.
WillAyd commented on a change in pull request #11990:
URL: https://github.com/apache/arrow/pull/11990#discussion_r778320769



##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal_test.cc
##########
@@ -546,6 +591,23 @@ TEST_F(ScalarTemporalTest, TestZoned2) {
     auto unit = timestamp(u, "Australia/Broken_Hill");
     auto month = "[1, 3, 1, 5, 1, 12, 12, 12, 1, 1, 1, 1, 12, 12, 12, 1, null]";
     auto day = "[1, 1, 1, 18, 1, 31, 30, 31, 1, 3, 4, 1, 31, 28, 29, 1, null]";
+    auto year_month_day = ArrayFromJSON(year_month_day_type,
+                                        R"([{"year": 1970, "month": 1, "day": 1},
+                          {"year": 2000, "month": 3, "day": 1},
+                          {"year": 1899, "month": 1, "day": 1},
+                          {"year": 2033, "month": 5, "day": 18},
+                          {"year": 2020, "month": 1, "day": 1},
+                          {"year": 2019, "month": 12, "day": 31},
+                          {"year": 2019, "month": 12, "day": 30},
+                          {"year": 2009, "month": 12, "day": 31},
+                          {"year": 2010, "month": 1, "day": 1},
+                          {"year": 2010, "month": 1, "day": 3},
+                          {"year": 2010, "month": 1, "day": 4},
+                          {"year": 2006, "month": 1, "day": 1},
+                          {"year": 2005, "month": 12, "day": 31},
+                          {"year": 2008, "month": 12, "day": 28},
+                          {"year": 2008, "month": 12, "day": 29},
+                          {"year": 2012, "month": 1, "day": 1}, null])");

Review comment:
       Thanks for the callout - just added this test




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] lidavidm commented on pull request #11990: ARROW-15032: [C++] Add year_month_day function

Posted by GitBox <gi...@apache.org>.
lidavidm commented on pull request #11990:
URL: https://github.com/apache/arrow/pull/11990#issuecomment-1005307530


   Merged. Thank you @WillAyd!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rok commented on a change in pull request #11990: ARROW-15032: [C++] Add DateStruct Function

Posted by GitBox <gi...@apache.org>.
rok commented on a change in pull request #11990:
URL: https://github.com/apache/arrow/pull/11990#discussion_r777436434



##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal_test.cc
##########
@@ -580,6 +627,7 @@ TEST_F(ScalarTemporalTest, TestZoned2) {
     CheckScalarUnary("year", unit, times_seconds_precision, int64(), year);
     CheckScalarUnary("month", unit, times_seconds_precision, int64(), month);
     CheckScalarUnary("day", unit, times_seconds_precision, int64(), day);
+    // TODO: where is year assigned in this function?

Review comment:
       This TODO comment can probably be removed?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rok commented on a change in pull request #11990: ARROW-15032: [C++] Add DateStruct Function

Posted by GitBox <gi...@apache.org>.
rok commented on a change in pull request #11990:
URL: https://github.com/apache/arrow/pull/11990#discussion_r777474147



##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal_test.cc
##########
@@ -546,6 +591,23 @@ TEST_F(ScalarTemporalTest, TestZoned2) {
     auto unit = timestamp(u, "Australia/Broken_Hill");
     auto month = "[1, 3, 1, 5, 1, 12, 12, 12, 1, 1, 1, 1, 12, 12, 12, 1, null]";
     auto day = "[1, 1, 1, 18, 1, 31, 30, 31, 1, 3, 4, 1, 31, 28, 29, 1, null]";
+    auto year_month_day = ArrayFromJSON(year_month_day_type,
+                                        R"([{"year": 1970, "month": 1, "day": 1},
+                          {"year": 2000, "month": 3, "day": 1},
+                          {"year": 1899, "month": 1, "day": 1},
+                          {"year": 2033, "month": 5, "day": 18},
+                          {"year": 2020, "month": 1, "day": 1},
+                          {"year": 2019, "month": 12, "day": 31},
+                          {"year": 2019, "month": 12, "day": 30},
+                          {"year": 2009, "month": 12, "day": 31},
+                          {"year": 2010, "month": 1, "day": 1},
+                          {"year": 2010, "month": 1, "day": 3},
+                          {"year": 2010, "month": 1, "day": 4},
+                          {"year": 2006, "month": 1, "day": 1},
+                          {"year": 2005, "month": 12, "day": 31},
+                          {"year": 2008, "month": 12, "day": 28},
+                          {"year": 2008, "month": 12, "day": 29},
+                          {"year": 2012, "month": 1, "day": 1}, null])");

Review comment:
       Looks like you could add assertions to `TestZoned2` and `TestNonexistentTimezone` indeed.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org