You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/08/25 13:54:58 UTC

[GitHub] [arrow] rok opened a new pull request #10998: ARROW-13684: [C++][Compute] Strftime kernel follow-up

rok opened a new pull request #10998:
URL: https://github.com/apache/arrow/pull/10998


   This is to resolve [ARROW-13684](https://issues.apache.org/jira/browse/ARROW-13684).
   
   1. Default strftime string is now `%Y-%m-%dT%H:%M:%S`. Perhaps `%Y-%m-%dT%H:%M:%S%z` would be better?
   2. Timestamps without timezone are now strftime-ed as if they were in `UTC`. Not sure this is the way to go. What if the local time is really invalid but we can't tell?
   3. Document `%S` behavior. What would be a good location to do that?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rok commented on pull request #10998: ARROW-13684: [C++][Compute] Strftime kernel follow-up

Posted by GitBox <gi...@apache.org>.
rok commented on pull request #10998:
URL: https://github.com/apache/arrow/pull/10998#issuecomment-913748927


   Thanks @westonpace @jorisvandenbossche @pitrou !


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rok commented on pull request #10998: ARROW-13684: [C++][Compute] Strftime kernel follow-up

Posted by GitBox <gi...@apache.org>.
rok commented on pull request #10998:
URL: https://github.com/apache/arrow/pull/10998#issuecomment-905796113


   > > Default strftime string is now `%Y-%m-%dT%H:%M:%S`. Perhaps `%Y-%m-%dT%H:%M:%S%z` would be better?
   > 
   > Not fully sure about this one. In [#10647 (comment)](https://github.com/apache/arrow/pull/10647#issuecomment-901783928) I mentioned the options of showing no timezone information (what you did now), include a numeric offset (so adding `%z`) or erroring. And for UTC it's also an option to use the literal "Z".
   > 
   > Including a numeric offset of using literal Z (for UTC) both mean that the default format depends on the type (because for timestamp without timezone we shouldn't add an offset).
   
   Right. And for the option where we would have UTC timestamps having "Z" at the end would require something like:
   ```
   if (timezone == "UTC" && self.options.format == "") {
     self.options.format = "%Y-%m-%dT%H:%M:%SZ";
   } else {
     self.options.format = "%Y-%m-%dT%H:%M:%S";
   }
   ```
   Where we would have `self.options.format = ""` by default.
   I'm not sure this is really worth it. I'd keep the timezone out of the default format.
   
   > > Timestamps without timezone are now strftime-ed as if they were in `UTC`. Not sure this is the way to go. What if the local time is really invalid but we can't tell?
   > 
   > Timestamps without timezone should be formatted as is, showing the wall clock time they represent. So I think formatting as if they are in UTC but without any timezone indication is correct. Although we should not allow adding timezone information with `%Z` or `%z` in those cases.
   > I don't think we need to care about invalid local times when printing, that's a user responsibility (and anyway, you only know if a time is invalid once you assume a certain timezone, which we don't do in strftime).
   
   Ok, let's return invalid when detecting `%z`/`%Z` and a timezone-less timestamp then?
   
   > > Document %S behavior. What would be a good location to do that?
   > 
   > You could start with the compute.rst table notes, but personally I would also include it in the docstring.
   
   Sounds good.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rok commented on a change in pull request #10998: ARROW-13684: [C++][Compute] Strftime kernel follow-up

Posted by GitBox <gi...@apache.org>.
rok commented on a change in pull request #10998:
URL: https://github.com/apache/arrow/pull/10998#discussion_r698648731



##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal.cc
##########
@@ -836,6 +841,10 @@ const FunctionDoc strftime_doc{
     "Format timestamps according to a format string",
     ("For each input timestamp, emit a formatted string.\n"
      "The time format string and locale can be set using StrftimeOptions.\n"
+     "Output precision of %S (seconds) flag depends on the input timestamp precision. "

Review comment:
       Updated. Could you please check if it's ok now. I'm not sure about the convention.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #10998: ARROW-13684: [C++][Compute] Strftime kernel follow-up

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #10998:
URL: https://github.com/apache/arrow/pull/10998#issuecomment-905522838


   https://issues.apache.org/jira/browse/ARROW-13684


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rok commented on pull request #10998: ARROW-13684: [C++][Compute] Strftime kernel follow-up

Posted by GitBox <gi...@apache.org>.
rok commented on pull request #10998:
URL: https://github.com/apache/arrow/pull/10998#issuecomment-909299852


   Thanks for the review @jorisvandenbossche @pitrou.
   I addressed the comments and pushed the changes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #10998: ARROW-13684: [C++][Compute] Strftime kernel follow-up

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #10998:
URL: https://github.com/apache/arrow/pull/10998#discussion_r698640772



##########
File path: docs/source/cpp/compute.rst
##########
@@ -1101,19 +1101,29 @@ number of input and output types.  The type to cast to can be passed in a
 :struct:`CastOptions` instance.  As an alternative, the same service is
 provided by a concrete function :func:`~arrow::compute::Cast`.
 
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| Function name            | Arity      | Input types        | Output type      | Options class                |
-+==========================+============+====================+==================+==============================+
-| cast                     | Unary      | Many               | Variable         | :struct:`CastOptions`        |
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| strftime                 | Unary      | Timestamp          | String           | :struct:`StrftimeOptions`    |
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| strptime                 | Unary      | String-like        | Timestamp        | :struct:`StrptimeOptions`    |
-+--------------------------+------------+--------------------+------------------+------------------------------+
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| Function name   | Arity      | Input types        | Output type      | Options class                | Notes |
++=================+============+====================+==================+==============================+=======+
+| cast            | Unary      | Many               | Variable         | :struct:`CastOptions`        |       |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| strftime        | Unary      | Timestamp          | String           | :struct:`StrftimeOptions`    | \(1)  |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| strptime        | Unary      | String-like        | Timestamp        | :struct:`StrptimeOptions`    |       |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
 
 The conversions available with ``cast`` are listed below.  In all cases, a
 null input value is converted into a null output value.
 
+* \(1) Output precision of seconds (``%S``) flag depends on the input timestamp
+  precision. If the number of seconds can not be exactly represented with seconds,
+  then the format is a decimal floating point number with a fixed format and a
+  precision matching that of the precision of the input. Note precision increases
+  three decimal points points per step going from seconds to nanoseconds.

Review comment:
       This says "resolved" but I don't see the changes. Did you forget to push?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #10998: ARROW-13684: [C++][Compute] Strftime kernel follow-up

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #10998:
URL: https://github.com/apache/arrow/pull/10998#discussion_r698641541



##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal.cc
##########
@@ -836,6 +841,10 @@ const FunctionDoc strftime_doc{
     "Format timestamps according to a format string",
     ("For each input timestamp, emit a formatted string.\n"
      "The time format string and locale can be set using StrftimeOptions.\n"
+     "Output precision of %S (seconds) flag depends on the input timestamp precision. "
+     "Timestamps with second precision are represented as integers while milliseconds, "
+     "microsecond and nanoseconds are represented as fixed floating point numbers with "
+     "3, 6 and 9 decimal points respectively."

Review comment:
       Similarly, this change doesn't seem to have been pushed.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou closed pull request #10998: ARROW-13684: [C++][Compute] Strftime kernel follow-up

Posted by GitBox <gi...@apache.org>.
pitrou closed pull request #10998:
URL: https://github.com/apache/arrow/pull/10998


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] westonpace commented on a change in pull request #10998: ARROW-13684: [C++][Compute] Strftime kernel follow-up

Posted by GitBox <gi...@apache.org>.
westonpace commented on a change in pull request #10998:
URL: https://github.com/apache/arrow/pull/10998#discussion_r696849507



##########
File path: python/pyarrow/tests/test_compute.py
##########
@@ -1492,18 +1492,25 @@ def _fix_timestamp(s):
 
         # Test setting locale
         tsa = pa.array(ts, type=pa.timestamp("s", timezone))
-        options = pc.StrftimeOptions("%Y-%m-%dT%H:%M:%SZ", "C")

Review comment:
       Did you leave at least one test case where you do print the time zone?

##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal_test.cc
##########
@@ -414,12 +414,18 @@ TEST_F(ScalarTemporalTest, Strftime) {
 }
 
 TEST_F(ScalarTemporalTest, StrftimeNoTimezone) {
+  auto options_default = StrftimeOptions();
   const char* seconds = R"(["1970-01-01T00:00:59", null])";
   auto arr = ArrayFromJSON(timestamp(TimeUnit::SECOND), seconds);
+
+  CheckScalarUnary("strftime", timestamp(TimeUnit::SECOND), seconds, utf8(), seconds,
+                   &options_default);
   EXPECT_RAISES_WITH_MESSAGE_THAT(
-      Invalid,
-      testing::HasSubstr("Timestamps without a time zone cannot be reliably formatted"),
-      Strftime(arr, StrftimeOptions()));
+      Invalid, testing::HasSubstr("Invalid: Timezone not present, cannot print"),

Review comment:
       Nit: I don't know if `cannot print` is quite right.  The user might be running strftime to store or display on a GUI or send to some other tool.  Perhaps `cannot convert to string`?  Or `cannot format`?

##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal.cc
##########
@@ -835,6 +839,10 @@ const FunctionDoc strftime_doc{
     "Format timestamps according to a format string",
     ("For each input timestamp, emit a formatted string.\n"
      "The time format string and locale can be set using StrftimeOptions.\n"
+     "Output precision of seconds %S flag depends on the input timestamp precision. "
+     "Timestamps with second precision are represented as integers while microsecond "
+     "precision timestamps are represented as a floating point number with 6 decimal "
+     "points."

Review comment:
       This comment seems to suggest that only microsecond and second are supported.  I like the wording down below which makes it more clear that seconds through nanoseconds are supported.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rok commented on a change in pull request #10998: ARROW-13684: [C++][Compute] Strftime kernel follow-up

Posted by GitBox <gi...@apache.org>.
rok commented on a change in pull request #10998:
URL: https://github.com/apache/arrow/pull/10998#discussion_r697041218



##########
File path: python/pyarrow/tests/test_compute.py
##########
@@ -1492,18 +1492,25 @@ def _fix_timestamp(s):
 
         # Test setting locale
         tsa = pa.array(ts, type=pa.timestamp("s", timezone))
-        options = pc.StrftimeOptions("%Y-%m-%dT%H:%M:%SZ", "C")

Review comment:
       Actually all formatting flags [are tested](https://github.com/apache/arrow/blob/554a1871f02adc9f094562dccec321c95838eabb/python/pyarrow/tests/test_compute.py#L1483) (but not visible in this diff). I've added a default format plus timezone test to make it more explicit.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rok commented on a change in pull request #10998: ARROW-13684: [C++][Compute] Strftime kernel follow-up

Posted by GitBox <gi...@apache.org>.
rok commented on a change in pull request #10998:
URL: https://github.com/apache/arrow/pull/10998#discussion_r697041678



##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal.cc
##########
@@ -835,6 +839,10 @@ const FunctionDoc strftime_doc{
     "Format timestamps according to a format string",
     ("For each input timestamp, emit a formatted string.\n"
      "The time format string and locale can be set using StrftimeOptions.\n"
+     "Output precision of seconds %S flag depends on the input timestamp precision. "
+     "Timestamps with second precision are represented as integers while microsecond "
+     "precision timestamps are represented as a floating point number with 6 decimal "
+     "points."

Review comment:
       Indeed. Changed the language. Could you check if it's good now?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rok commented on pull request #10998: ARROW-13684: [C++][Compute] Strftime kernel follow-up

Posted by GitBox <gi...@apache.org>.
rok commented on pull request #10998:
URL: https://github.com/apache/arrow/pull/10998#issuecomment-909299852


   Thanks for the review @jorisvandenbossche @pitrou.
   I addressed the comments and pushed the changes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #10998: ARROW-13684: [C++][Compute] Strftime kernel follow-up

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #10998:
URL: https://github.com/apache/arrow/pull/10998#discussion_r698640772



##########
File path: docs/source/cpp/compute.rst
##########
@@ -1101,19 +1101,29 @@ number of input and output types.  The type to cast to can be passed in a
 :struct:`CastOptions` instance.  As an alternative, the same service is
 provided by a concrete function :func:`~arrow::compute::Cast`.
 
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| Function name            | Arity      | Input types        | Output type      | Options class                |
-+==========================+============+====================+==================+==============================+
-| cast                     | Unary      | Many               | Variable         | :struct:`CastOptions`        |
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| strftime                 | Unary      | Timestamp          | String           | :struct:`StrftimeOptions`    |
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| strptime                 | Unary      | String-like        | Timestamp        | :struct:`StrptimeOptions`    |
-+--------------------------+------------+--------------------+------------------+------------------------------+
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| Function name   | Arity      | Input types        | Output type      | Options class                | Notes |
++=================+============+====================+==================+==============================+=======+
+| cast            | Unary      | Many               | Variable         | :struct:`CastOptions`        |       |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| strftime        | Unary      | Timestamp          | String           | :struct:`StrftimeOptions`    | \(1)  |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| strptime        | Unary      | String-like        | Timestamp        | :struct:`StrptimeOptions`    |       |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
 
 The conversions available with ``cast`` are listed below.  In all cases, a
 null input value is converted into a null output value.
 
+* \(1) Output precision of seconds (``%S``) flag depends on the input timestamp
+  precision. If the number of seconds can not be exactly represented with seconds,
+  then the format is a decimal floating point number with a fixed format and a
+  precision matching that of the precision of the input. Note precision increases
+  three decimal points points per step going from seconds to nanoseconds.

Review comment:
       This says "resolved" but I don't see the changes. Did you forget to push?

##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal.cc
##########
@@ -836,6 +841,10 @@ const FunctionDoc strftime_doc{
     "Format timestamps according to a format string",
     ("For each input timestamp, emit a formatted string.\n"
      "The time format string and locale can be set using StrftimeOptions.\n"
+     "Output precision of %S (seconds) flag depends on the input timestamp precision. "
+     "Timestamps with second precision are represented as integers while milliseconds, "
+     "microsecond and nanoseconds are represented as fixed floating point numbers with "
+     "3, 6 and 9 decimal points respectively."

Review comment:
       Similarly, this change doesn't seem to have been pushed.

##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal.cc
##########
@@ -836,6 +841,10 @@ const FunctionDoc strftime_doc{
     "Format timestamps according to a format string",
     ("For each input timestamp, emit a formatted string.\n"
      "The time format string and locale can be set using StrftimeOptions.\n"
+     "Output precision of %S (seconds) flag depends on the input timestamp precision. "

Review comment:
       I don't know if the elaboration about the "%S" is useful to put here, but if it is, can you follow the convention for line wrapping?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rok commented on pull request #10998: ARROW-13684: [C++][Compute] Strftime kernel follow-up

Posted by GitBox <gi...@apache.org>.
rok commented on pull request #10998:
URL: https://github.com/apache/arrow/pull/10998#issuecomment-905522710


   @jorisvandenbossche 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rok commented on pull request #10998: ARROW-13684: [C++][Compute] Strftime kernel follow-up

Posted by GitBox <gi...@apache.org>.
rok commented on pull request #10998:
URL: https://github.com/apache/arrow/pull/10998#issuecomment-909299852


   Thanks for the review @jorisvandenbossche @pitrou.
   I addressed the comments and pushed the changes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rok commented on pull request #10998: ARROW-13684: [C++][Compute] Strftime kernel follow-up

Posted by GitBox <gi...@apache.org>.
rok commented on pull request #10998:
URL: https://github.com/apache/arrow/pull/10998#issuecomment-907155724


   @jorisvandenbossche @westonpace I've made a [minor change in the text](https://github.com/apache/arrow/commit/9ca3ccaead312d67e99b8744f2fc478412bf9dda) (decimal points -> decimal places).
   Other than that I think this is done.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] westonpace commented on a change in pull request #10998: ARROW-13684: [C++][Compute] Strftime kernel follow-up

Posted by GitBox <gi...@apache.org>.
westonpace commented on a change in pull request #10998:
URL: https://github.com/apache/arrow/pull/10998#discussion_r697042488



##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal.cc
##########
@@ -835,6 +839,10 @@ const FunctionDoc strftime_doc{
     "Format timestamps according to a format string",
     ("For each input timestamp, emit a formatted string.\n"
      "The time format string and locale can be set using StrftimeOptions.\n"
+     "Output precision of seconds %S flag depends on the input timestamp precision. "
+     "Timestamps with second precision are represented as integers while microsecond "
+     "precision timestamps are represented as a floating point number with 6 decimal "
+     "points."

Review comment:
       Very clear.  Thanks.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rok commented on a change in pull request #10998: ARROW-13684: [C++][Compute] Strftime kernel follow-up

Posted by GitBox <gi...@apache.org>.
rok commented on a change in pull request #10998:
URL: https://github.com/apache/arrow/pull/10998#discussion_r697041446



##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal_test.cc
##########
@@ -414,12 +414,18 @@ TEST_F(ScalarTemporalTest, Strftime) {
 }
 
 TEST_F(ScalarTemporalTest, StrftimeNoTimezone) {
+  auto options_default = StrftimeOptions();
   const char* seconds = R"(["1970-01-01T00:00:59", null])";
   auto arr = ArrayFromJSON(timestamp(TimeUnit::SECOND), seconds);
+
+  CheckScalarUnary("strftime", timestamp(TimeUnit::SECOND), seconds, utf8(), seconds,
+                   &options_default);
   EXPECT_RAISES_WITH_MESSAGE_THAT(
-      Invalid,
-      testing::HasSubstr("Timestamps without a time zone cannot be reliably formatted"),
-      Strftime(arr, StrftimeOptions()));
+      Invalid, testing::HasSubstr("Invalid: Timezone not present, cannot print"),

Review comment:
       Yeah, good point. I went with your first proposal.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #10998: ARROW-13684: [C++][Compute] Strftime kernel follow-up

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #10998:
URL: https://github.com/apache/arrow/pull/10998#discussion_r698641955



##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal.cc
##########
@@ -836,6 +841,10 @@ const FunctionDoc strftime_doc{
     "Format timestamps according to a format string",
     ("For each input timestamp, emit a formatted string.\n"
      "The time format string and locale can be set using StrftimeOptions.\n"
+     "Output precision of %S (seconds) flag depends on the input timestamp precision. "

Review comment:
       I don't know if the elaboration about the "%S" is useful to put here, but if it is, can you follow the convention for line wrapping?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rok commented on a change in pull request #10998: ARROW-13684: [C++][Compute] Strftime kernel follow-up

Posted by GitBox <gi...@apache.org>.
rok commented on a change in pull request #10998:
URL: https://github.com/apache/arrow/pull/10998#discussion_r699389738



##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal.cc
##########
@@ -835,8 +840,12 @@ const FunctionDoc subsecond_doc{
 const FunctionDoc strftime_doc{
     "Format timestamps according to a format string",
     ("For each input timestamp, emit a formatted string.\n"
-     "The time format string and locale can be set using StrftimeOptions.\n"
-     "An error is returned if the timestamps don't have a defined timezone,\n"
+     "The time format string and locale can be set using StrftimeOptions. "
+     "Output precision of %S (seconds) flag depends on the input timestamp precision. "
+     "Timestamps with second precision are represented as integers while milliseconds, "
+     "microsecond and nanoseconds are represented as fixed floating point numbers with "
+     "3, 6 and 9 decimal places respectively.\n"
+     "An error is returned if the timestamps don't have a defined timezone,"

Review comment:
       It seems error description is outdated for all extract kernels. Thanks for pointing it out! Fixing.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rok commented on a change in pull request #10998: ARROW-13684: [C++][Compute] Strftime kernel follow-up

Posted by GitBox <gi...@apache.org>.
rok commented on a change in pull request #10998:
URL: https://github.com/apache/arrow/pull/10998#discussion_r699389738



##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal.cc
##########
@@ -835,8 +840,12 @@ const FunctionDoc subsecond_doc{
 const FunctionDoc strftime_doc{
     "Format timestamps according to a format string",
     ("For each input timestamp, emit a formatted string.\n"
-     "The time format string and locale can be set using StrftimeOptions.\n"
-     "An error is returned if the timestamps don't have a defined timezone,\n"
+     "The time format string and locale can be set using StrftimeOptions. "
+     "Output precision of %S (seconds) flag depends on the input timestamp precision. "
+     "Timestamps with second precision are represented as integers while milliseconds, "
+     "microsecond and nanoseconds are represented as fixed floating point numbers with "
+     "3, 6 and 9 decimal places respectively.\n"
+     "An error is returned if the timestamps don't have a defined timezone,"

Review comment:
       It seems error description is outdated for all extract kernels. Thanks for pointing it out! Fixing.

##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal.cc
##########
@@ -835,8 +840,12 @@ const FunctionDoc subsecond_doc{
 const FunctionDoc strftime_doc{
     "Format timestamps according to a format string",
     ("For each input timestamp, emit a formatted string.\n"
-     "The time format string and locale can be set using StrftimeOptions.\n"
-     "An error is returned if the timestamps don't have a defined timezone,\n"
+     "The time format string and locale can be set using StrftimeOptions. "
+     "Output precision of %S (seconds) flag depends on the input timestamp precision. "
+     "Timestamps with second precision are represented as integers while milliseconds, "
+     "microsecond and nanoseconds are represented as fixed floating point numbers with "
+     "3, 6 and 9 decimal places respectively.\n"

Review comment:
       Done.

##########
File path: docs/source/cpp/compute.rst
##########
@@ -1094,19 +1094,29 @@ number of input and output types.  The type to cast to can be passed in a
 :struct:`CastOptions` instance.  As an alternative, the same service is
 provided by a concrete function :func:`~arrow::compute::Cast`.
 
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| Function name            | Arity      | Input types        | Output type      | Options class                |
-+==========================+============+====================+==================+==============================+
-| cast                     | Unary      | Many               | Variable         | :struct:`CastOptions`        |
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| strftime                 | Unary      | Timestamp          | String           | :struct:`StrftimeOptions`    |
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| strptime                 | Unary      | String-like        | Timestamp        | :struct:`StrptimeOptions`    |
-+--------------------------+------------+--------------------+------------------+------------------------------+
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| Function name   | Arity      | Input types        | Output type      | Options class                | Notes |
++=================+============+====================+==================+==============================+=======+
+| cast            | Unary      | Many               | Variable         | :struct:`CastOptions`        |       |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| strftime        | Unary      | Timestamp          | String           | :struct:`StrftimeOptions`    | \(1)  |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| strptime        | Unary      | String-like        | Timestamp        | :struct:`StrptimeOptions`    |       |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
 
 The conversions available with ``cast`` are listed below.  In all cases, a
 null input value is converted into a null output value.
 
+* \(1) Output precision of seconds (``%S``) flag depends on the input timestamp
+  precision. If the number of seconds can not be exactly represented with seconds,

Review comment:
       Makes sense yeah. Done.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rok commented on pull request #10998: ARROW-13684: [C++][Compute] Strftime kernel follow-up

Posted by GitBox <gi...@apache.org>.
rok commented on pull request #10998:
URL: https://github.com/apache/arrow/pull/10998#issuecomment-905878409


   @jorisvandenbossche I've pushed changes for second and third (printing timezoneless timestamps and docs) point. I left the first as was (not timezone data by default).
   Please see if this looks ok.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rok commented on a change in pull request #10998: ARROW-13684: [C++][Compute] Strftime kernel follow-up

Posted by GitBox <gi...@apache.org>.
rok commented on a change in pull request #10998:
URL: https://github.com/apache/arrow/pull/10998#discussion_r697045398



##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal.cc
##########
@@ -836,6 +841,10 @@ const FunctionDoc strftime_doc{
     "Format timestamps according to a format string",
     ("For each input timestamp, emit a formatted string.\n"
      "The time format string and locale can be set using StrftimeOptions.\n"
+     "Output precision of %S (seconds) flag depends on the input timestamp precision. "
+     "Timestamps with second precision are represented as integers while milliseconds, "
+     "microsecond and nanoseconds are represented as fixed floating point numbers with "
+     "3, 6 and 9 decimal points respectively."

Review comment:
       ```suggestion
        "3, 6 and 9 decimal places respectively."
   ```

##########
File path: docs/source/cpp/compute.rst
##########
@@ -1101,19 +1101,29 @@ number of input and output types.  The type to cast to can be passed in a
 :struct:`CastOptions` instance.  As an alternative, the same service is
 provided by a concrete function :func:`~arrow::compute::Cast`.
 
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| Function name            | Arity      | Input types        | Output type      | Options class                |
-+==========================+============+====================+==================+==============================+
-| cast                     | Unary      | Many               | Variable         | :struct:`CastOptions`        |
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| strftime                 | Unary      | Timestamp          | String           | :struct:`StrftimeOptions`    |
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| strptime                 | Unary      | String-like        | Timestamp        | :struct:`StrptimeOptions`    |
-+--------------------------+------------+--------------------+------------------+------------------------------+
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| Function name   | Arity      | Input types        | Output type      | Options class                | Notes |
++=================+============+====================+==================+==============================+=======+
+| cast            | Unary      | Many               | Variable         | :struct:`CastOptions`        |       |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| strftime        | Unary      | Timestamp          | String           | :struct:`StrftimeOptions`    | \(1)  |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| strptime        | Unary      | String-like        | Timestamp        | :struct:`StrptimeOptions`    |       |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
 
 The conversions available with ``cast`` are listed below.  In all cases, a
 null input value is converted into a null output value.
 
+* \(1) Output precision of seconds (``%S``) flag depends on the input timestamp
+  precision. If the number of seconds can not be exactly represented with seconds,
+  then the format is a decimal floating point number with a fixed format and a
+  precision matching that of the precision of the input. Note precision increases
+  three decimal points points per step going from seconds to nanoseconds.

Review comment:
       ```suggestion
     three decimal places per step going from seconds to nanoseconds.
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #10998: ARROW-13684: [C++][Compute] Strftime kernel follow-up

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on a change in pull request #10998:
URL: https://github.com/apache/arrow/pull/10998#discussion_r699271130



##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal.cc
##########
@@ -835,8 +840,12 @@ const FunctionDoc subsecond_doc{
 const FunctionDoc strftime_doc{
     "Format timestamps according to a format string",
     ("For each input timestamp, emit a formatted string.\n"
-     "The time format string and locale can be set using StrftimeOptions.\n"
-     "An error is returned if the timestamps don't have a defined timezone,\n"
+     "The time format string and locale can be set using StrftimeOptions. "
+     "Output precision of %S (seconds) flag depends on the input timestamp precision. "
+     "Timestamps with second precision are represented as integers while milliseconds, "
+     "microsecond and nanoseconds are represented as fixed floating point numbers with "
+     "3, 6 and 9 decimal places respectively.\n"

Review comment:
       I would also add something about "To obtain integer seconds, cast to timestamp with second resolution" to be explicit about the workaround.

##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal.cc
##########
@@ -835,8 +840,12 @@ const FunctionDoc subsecond_doc{
 const FunctionDoc strftime_doc{
     "Format timestamps according to a format string",
     ("For each input timestamp, emit a formatted string.\n"
-     "The time format string and locale can be set using StrftimeOptions.\n"
-     "An error is returned if the timestamps don't have a defined timezone,\n"
+     "The time format string and locale can be set using StrftimeOptions. "
+     "Output precision of %S (seconds) flag depends on the input timestamp precision. "
+     "Timestamps with second precision are represented as integers while milliseconds, "
+     "microsecond and nanoseconds are represented as fixed floating point numbers with "
+     "3, 6 and 9 decimal places respectively.\n"
+     "An error is returned if the timestamps don't have a defined timezone,"

Review comment:
       This is no longer up to date

##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal.cc
##########
@@ -476,11 +476,16 @@ struct Strftime {
   static Result<Strftime> Make(KernelContext* ctx, const DataType& type) {
     const StrftimeOptions& options = StrftimeState::Get(ctx);
 
-    const auto& timezone = GetInputTimezone(type);
+    auto timezone = GetInputTimezone(type);
     if (timezone.empty()) {
-      return Status::Invalid(
-          "Timestamps without a time zone cannot be reliably formatted.");
+      if ((options.format.find("%z") != std::string::npos) ||
+          (options.format.find("%Z") != std::string::npos)) {
+        return Status::Invalid("Timezone not present, cannot convert to string: ",

Review comment:
       ```suggestion
           return Status::Invalid("Timezone not present, cannot convert to string with timezone: ",
   ```

##########
File path: docs/source/cpp/compute.rst
##########
@@ -1094,19 +1094,29 @@ number of input and output types.  The type to cast to can be passed in a
 :struct:`CastOptions` instance.  As an alternative, the same service is
 provided by a concrete function :func:`~arrow::compute::Cast`.
 
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| Function name            | Arity      | Input types        | Output type      | Options class                |
-+==========================+============+====================+==================+==============================+
-| cast                     | Unary      | Many               | Variable         | :struct:`CastOptions`        |
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| strftime                 | Unary      | Timestamp          | String           | :struct:`StrftimeOptions`    |
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| strptime                 | Unary      | String-like        | Timestamp        | :struct:`StrptimeOptions`    |
-+--------------------------+------------+--------------------+------------------+------------------------------+
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| Function name   | Arity      | Input types        | Output type      | Options class                | Notes |
++=================+============+====================+==================+==============================+=======+
+| cast            | Unary      | Many               | Variable         | :struct:`CastOptions`        |       |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| strftime        | Unary      | Timestamp          | String           | :struct:`StrftimeOptions`    | \(1)  |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| strptime        | Unary      | String-like        | Timestamp        | :struct:`StrptimeOptions`    |       |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
 
 The conversions available with ``cast`` are listed below.  In all cases, a
 null input value is converted into a null output value.
 
+* \(1) Output precision of seconds (``%S``) flag depends on the input timestamp
+  precision. If the number of seconds can not be exactly represented with seconds,

Review comment:
       It seems you used some explanation about this from https://howardhinnant.github.io/date/date.html#to_stream_formatting, but I would maybe use the same text as you added above in `strftime_doc`, as I think that's clearer / more relevant from Arrow's perspective.

##########
File path: docs/source/cpp/compute.rst
##########
@@ -1094,19 +1094,29 @@ number of input and output types.  The type to cast to can be passed in a
 :struct:`CastOptions` instance.  As an alternative, the same service is
 provided by a concrete function :func:`~arrow::compute::Cast`.
 
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| Function name            | Arity      | Input types        | Output type      | Options class                |
-+==========================+============+====================+==================+==============================+
-| cast                     | Unary      | Many               | Variable         | :struct:`CastOptions`        |
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| strftime                 | Unary      | Timestamp          | String           | :struct:`StrftimeOptions`    |
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| strptime                 | Unary      | String-like        | Timestamp        | :struct:`StrptimeOptions`    |
-+--------------------------+------------+--------------------+------------------+------------------------------+
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| Function name   | Arity      | Input types        | Output type      | Options class                | Notes |
++=================+============+====================+==================+==============================+=======+
+| cast            | Unary      | Many               | Variable         | :struct:`CastOptions`        |       |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| strftime        | Unary      | Timestamp          | String           | :struct:`StrftimeOptions`    | \(1)  |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| strptime        | Unary      | String-like        | Timestamp        | :struct:`StrptimeOptions`    |       |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
 
 The conversions available with ``cast`` are listed below.  In all cases, a
 null input value is converted into a null output value.
 
+* \(1) Output precision of seconds (``%S``) flag depends on the input timestamp
+  precision. If the number of seconds can not be exactly represented with seconds,

Review comment:
       I don't think this is accurate? It doesn't depend on the number of seconds present, but solely on the unit of the timestamp type?

##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal.cc
##########
@@ -836,6 +841,10 @@ const FunctionDoc strftime_doc{
     "Format timestamps according to a format string",
     ("For each input timestamp, emit a formatted string.\n"
      "The time format string and locale can be set using StrftimeOptions.\n"
+     "Output precision of %S (seconds) flag depends on the input timestamp precision. "

Review comment:
       > I don't know if the elaboration about the "%S" is useful to put here
   
   @pitrou since this behaves differently as eg Python and R's version of `strftime`, I think it's important to note this somewhere (but can also be somewhere else, eg in compute.rst as a note below the table. EDIT: I see it's already added there)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rok commented on a change in pull request #10998: ARROW-13684: [C++][Compute] Strftime kernel follow-up

Posted by GitBox <gi...@apache.org>.
rok commented on a change in pull request #10998:
URL: https://github.com/apache/arrow/pull/10998#discussion_r698647919



##########
File path: docs/source/cpp/compute.rst
##########
@@ -1101,19 +1101,29 @@ number of input and output types.  The type to cast to can be passed in a
 :struct:`CastOptions` instance.  As an alternative, the same service is
 provided by a concrete function :func:`~arrow::compute::Cast`.
 
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| Function name            | Arity      | Input types        | Output type      | Options class                |
-+==========================+============+====================+==================+==============================+
-| cast                     | Unary      | Many               | Variable         | :struct:`CastOptions`        |
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| strftime                 | Unary      | Timestamp          | String           | :struct:`StrftimeOptions`    |
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| strptime                 | Unary      | String-like        | Timestamp        | :struct:`StrptimeOptions`    |
-+--------------------------+------------+--------------------+------------------+------------------------------+
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| Function name   | Arity      | Input types        | Output type      | Options class                | Notes |
++=================+============+====================+==================+==============================+=======+
+| cast            | Unary      | Many               | Variable         | :struct:`CastOptions`        |       |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| strftime        | Unary      | Timestamp          | String           | :struct:`StrftimeOptions`    | \(1)  |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| strptime        | Unary      | String-like        | Timestamp        | :struct:`StrptimeOptions`    |       |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
 
 The conversions available with ``cast`` are listed below.  In all cases, a
 null input value is converted into a null output value.
 
+* \(1) Output precision of seconds (``%S``) flag depends on the input timestamp
+  precision. If the number of seconds can not be exactly represented with seconds,
+  then the format is a decimal floating point number with a fixed format and a
+  precision matching that of the precision of the input. Note precision increases
+  three decimal points points per step going from seconds to nanoseconds.

Review comment:
       Thanks for catching that. I think what happened was an accidental force push.

##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal.cc
##########
@@ -836,6 +841,10 @@ const FunctionDoc strftime_doc{
     "Format timestamps according to a format string",
     ("For each input timestamp, emit a formatted string.\n"
      "The time format string and locale can be set using StrftimeOptions.\n"
+     "Output precision of %S (seconds) flag depends on the input timestamp precision. "

Review comment:
       Updated. Could you please check if it's ok now. I'm not sure about the convention.

##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal.cc
##########
@@ -836,6 +841,10 @@ const FunctionDoc strftime_doc{
     "Format timestamps according to a format string",
     ("For each input timestamp, emit a formatted string.\n"
      "The time format string and locale can be set using StrftimeOptions.\n"
+     "Output precision of %S (seconds) flag depends on the input timestamp precision. "

Review comment:
       This evaluates to true in MySQL on sqlfiddle:
   ```SQL
   SELECT WEEK('2008-1-6', 1) != WEEK('2008-1-6', 4)
   ```

##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal.cc
##########
@@ -836,6 +841,10 @@ const FunctionDoc strftime_doc{
     "Format timestamps according to a format string",
     ("For each input timestamp, emit a formatted string.\n"
      "The time format string and locale can be set using StrftimeOptions.\n"
+     "Output precision of %S (seconds) flag depends on the input timestamp precision. "

Review comment:
       This evaluates to true in MySQL on sqlfiddle:
   ```SQL
   SELECT WEEK('2008-1-6', 1) != WEEK('2008-1-6', 4)
   ```
   
   Mode | First day of week | Range | Week 1 is the first week
   -- | -- | -- | --
   0 | Sunday | 0-53 | with a Sunday in this year
   1 | Monday | 0-53 | with 4 or more days this year
   2 | Sunday | 1-53 | with a Sunday in this year
   3 | Monday | 1-53 | with 4 or more days this year
   4 | Sunday | 0-53 | with 4 or more days this year
   5 | Monday | 0-53 | with a Monday in this year
   6 | Sunday | 1-53 | with 4 or more days this year
   7 | Monday | 1-53 | with a Monday in this year
   

##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal.cc
##########
@@ -835,8 +840,12 @@ const FunctionDoc subsecond_doc{
 const FunctionDoc strftime_doc{
     "Format timestamps according to a format string",
     ("For each input timestamp, emit a formatted string.\n"
-     "The time format string and locale can be set using StrftimeOptions.\n"
-     "An error is returned if the timestamps don't have a defined timezone,\n"
+     "The time format string and locale can be set using StrftimeOptions. "
+     "Output precision of %S (seconds) flag depends on the input timestamp precision. "
+     "Timestamps with second precision are represented as integers while milliseconds, "
+     "microsecond and nanoseconds are represented as fixed floating point numbers with "
+     "3, 6 and 9 decimal places respectively.\n"
+     "An error is returned if the timestamps don't have a defined timezone,"

Review comment:
       It seems error description is outdated for all extract kernels. Thanks for pointing it out! Fixing.

##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal.cc
##########
@@ -835,8 +840,12 @@ const FunctionDoc subsecond_doc{
 const FunctionDoc strftime_doc{
     "Format timestamps according to a format string",
     ("For each input timestamp, emit a formatted string.\n"
-     "The time format string and locale can be set using StrftimeOptions.\n"
-     "An error is returned if the timestamps don't have a defined timezone,\n"
+     "The time format string and locale can be set using StrftimeOptions. "
+     "Output precision of %S (seconds) flag depends on the input timestamp precision. "
+     "Timestamps with second precision are represented as integers while milliseconds, "
+     "microsecond and nanoseconds are represented as fixed floating point numbers with "
+     "3, 6 and 9 decimal places respectively.\n"

Review comment:
       Done.

##########
File path: docs/source/cpp/compute.rst
##########
@@ -1094,19 +1094,29 @@ number of input and output types.  The type to cast to can be passed in a
 :struct:`CastOptions` instance.  As an alternative, the same service is
 provided by a concrete function :func:`~arrow::compute::Cast`.
 
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| Function name            | Arity      | Input types        | Output type      | Options class                |
-+==========================+============+====================+==================+==============================+
-| cast                     | Unary      | Many               | Variable         | :struct:`CastOptions`        |
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| strftime                 | Unary      | Timestamp          | String           | :struct:`StrftimeOptions`    |
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| strptime                 | Unary      | String-like        | Timestamp        | :struct:`StrptimeOptions`    |
-+--------------------------+------------+--------------------+------------------+------------------------------+
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| Function name   | Arity      | Input types        | Output type      | Options class                | Notes |
++=================+============+====================+==================+==============================+=======+
+| cast            | Unary      | Many               | Variable         | :struct:`CastOptions`        |       |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| strftime        | Unary      | Timestamp          | String           | :struct:`StrftimeOptions`    | \(1)  |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| strptime        | Unary      | String-like        | Timestamp        | :struct:`StrptimeOptions`    |       |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
 
 The conversions available with ``cast`` are listed below.  In all cases, a
 null input value is converted into a null output value.
 
+* \(1) Output precision of seconds (``%S``) flag depends on the input timestamp
+  precision. If the number of seconds can not be exactly represented with seconds,

Review comment:
       Makes sense yeah. Done.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rok commented on a change in pull request #10998: ARROW-13684: [C++][Compute] Strftime kernel follow-up

Posted by GitBox <gi...@apache.org>.
rok commented on a change in pull request #10998:
URL: https://github.com/apache/arrow/pull/10998#discussion_r699390721



##########
File path: docs/source/cpp/compute.rst
##########
@@ -1094,19 +1094,29 @@ number of input and output types.  The type to cast to can be passed in a
 :struct:`CastOptions` instance.  As an alternative, the same service is
 provided by a concrete function :func:`~arrow::compute::Cast`.
 
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| Function name            | Arity      | Input types        | Output type      | Options class                |
-+==========================+============+====================+==================+==============================+
-| cast                     | Unary      | Many               | Variable         | :struct:`CastOptions`        |
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| strftime                 | Unary      | Timestamp          | String           | :struct:`StrftimeOptions`    |
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| strptime                 | Unary      | String-like        | Timestamp        | :struct:`StrptimeOptions`    |
-+--------------------------+------------+--------------------+------------------+------------------------------+
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| Function name   | Arity      | Input types        | Output type      | Options class                | Notes |
++=================+============+====================+==================+==============================+=======+
+| cast            | Unary      | Many               | Variable         | :struct:`CastOptions`        |       |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| strftime        | Unary      | Timestamp          | String           | :struct:`StrftimeOptions`    | \(1)  |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| strptime        | Unary      | String-like        | Timestamp        | :struct:`StrptimeOptions`    |       |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
 
 The conversions available with ``cast`` are listed below.  In all cases, a
 null input value is converted into a null output value.
 
+* \(1) Output precision of seconds (``%S``) flag depends on the input timestamp
+  precision. If the number of seconds can not be exactly represented with seconds,

Review comment:
       Makes sense yeah. Done.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rok commented on a change in pull request #10998: ARROW-13684: [C++][Compute] Strftime kernel follow-up

Posted by GitBox <gi...@apache.org>.
rok commented on a change in pull request #10998:
URL: https://github.com/apache/arrow/pull/10998#discussion_r698660433



##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal.cc
##########
@@ -836,6 +841,10 @@ const FunctionDoc strftime_doc{
     "Format timestamps according to a format string",
     ("For each input timestamp, emit a formatted string.\n"
      "The time format string and locale can be set using StrftimeOptions.\n"
+     "Output precision of %S (seconds) flag depends on the input timestamp precision. "

Review comment:
       This evaluates to true in MySQL on sqlfiddle:
   ```SQL
   SELECT WEEK('2008-1-6', 1) != WEEK('2008-1-6', 4)
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #10998: ARROW-13684: [C++][Compute] Strftime kernel follow-up

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on a change in pull request #10998:
URL: https://github.com/apache/arrow/pull/10998#discussion_r699271130



##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal.cc
##########
@@ -835,8 +840,12 @@ const FunctionDoc subsecond_doc{
 const FunctionDoc strftime_doc{
     "Format timestamps according to a format string",
     ("For each input timestamp, emit a formatted string.\n"
-     "The time format string and locale can be set using StrftimeOptions.\n"
-     "An error is returned if the timestamps don't have a defined timezone,\n"
+     "The time format string and locale can be set using StrftimeOptions. "
+     "Output precision of %S (seconds) flag depends on the input timestamp precision. "
+     "Timestamps with second precision are represented as integers while milliseconds, "
+     "microsecond and nanoseconds are represented as fixed floating point numbers with "
+     "3, 6 and 9 decimal places respectively.\n"

Review comment:
       I would also add something about "To obtain integer seconds, cast to timestamp with second resolution" to be explicit about the workaround.

##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal.cc
##########
@@ -835,8 +840,12 @@ const FunctionDoc subsecond_doc{
 const FunctionDoc strftime_doc{
     "Format timestamps according to a format string",
     ("For each input timestamp, emit a formatted string.\n"
-     "The time format string and locale can be set using StrftimeOptions.\n"
-     "An error is returned if the timestamps don't have a defined timezone,\n"
+     "The time format string and locale can be set using StrftimeOptions. "
+     "Output precision of %S (seconds) flag depends on the input timestamp precision. "
+     "Timestamps with second precision are represented as integers while milliseconds, "
+     "microsecond and nanoseconds are represented as fixed floating point numbers with "
+     "3, 6 and 9 decimal places respectively.\n"
+     "An error is returned if the timestamps don't have a defined timezone,"

Review comment:
       This is no longer up to date

##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal.cc
##########
@@ -476,11 +476,16 @@ struct Strftime {
   static Result<Strftime> Make(KernelContext* ctx, const DataType& type) {
     const StrftimeOptions& options = StrftimeState::Get(ctx);
 
-    const auto& timezone = GetInputTimezone(type);
+    auto timezone = GetInputTimezone(type);
     if (timezone.empty()) {
-      return Status::Invalid(
-          "Timestamps without a time zone cannot be reliably formatted.");
+      if ((options.format.find("%z") != std::string::npos) ||
+          (options.format.find("%Z") != std::string::npos)) {
+        return Status::Invalid("Timezone not present, cannot convert to string: ",

Review comment:
       ```suggestion
           return Status::Invalid("Timezone not present, cannot convert to string with timezone: ",
   ```

##########
File path: docs/source/cpp/compute.rst
##########
@@ -1094,19 +1094,29 @@ number of input and output types.  The type to cast to can be passed in a
 :struct:`CastOptions` instance.  As an alternative, the same service is
 provided by a concrete function :func:`~arrow::compute::Cast`.
 
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| Function name            | Arity      | Input types        | Output type      | Options class                |
-+==========================+============+====================+==================+==============================+
-| cast                     | Unary      | Many               | Variable         | :struct:`CastOptions`        |
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| strftime                 | Unary      | Timestamp          | String           | :struct:`StrftimeOptions`    |
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| strptime                 | Unary      | String-like        | Timestamp        | :struct:`StrptimeOptions`    |
-+--------------------------+------------+--------------------+------------------+------------------------------+
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| Function name   | Arity      | Input types        | Output type      | Options class                | Notes |
++=================+============+====================+==================+==============================+=======+
+| cast            | Unary      | Many               | Variable         | :struct:`CastOptions`        |       |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| strftime        | Unary      | Timestamp          | String           | :struct:`StrftimeOptions`    | \(1)  |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| strptime        | Unary      | String-like        | Timestamp        | :struct:`StrptimeOptions`    |       |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
 
 The conversions available with ``cast`` are listed below.  In all cases, a
 null input value is converted into a null output value.
 
+* \(1) Output precision of seconds (``%S``) flag depends on the input timestamp
+  precision. If the number of seconds can not be exactly represented with seconds,

Review comment:
       It seems you used some explanation about this from https://howardhinnant.github.io/date/date.html#to_stream_formatting, but I would maybe use the same text as you added above in `strftime_doc`, as I think that's clearer / more relevant from Arrow's perspective.

##########
File path: docs/source/cpp/compute.rst
##########
@@ -1094,19 +1094,29 @@ number of input and output types.  The type to cast to can be passed in a
 :struct:`CastOptions` instance.  As an alternative, the same service is
 provided by a concrete function :func:`~arrow::compute::Cast`.
 
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| Function name            | Arity      | Input types        | Output type      | Options class                |
-+==========================+============+====================+==================+==============================+
-| cast                     | Unary      | Many               | Variable         | :struct:`CastOptions`        |
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| strftime                 | Unary      | Timestamp          | String           | :struct:`StrftimeOptions`    |
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| strptime                 | Unary      | String-like        | Timestamp        | :struct:`StrptimeOptions`    |
-+--------------------------+------------+--------------------+------------------+------------------------------+
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| Function name   | Arity      | Input types        | Output type      | Options class                | Notes |
++=================+============+====================+==================+==============================+=======+
+| cast            | Unary      | Many               | Variable         | :struct:`CastOptions`        |       |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| strftime        | Unary      | Timestamp          | String           | :struct:`StrftimeOptions`    | \(1)  |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| strptime        | Unary      | String-like        | Timestamp        | :struct:`StrptimeOptions`    |       |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
 
 The conversions available with ``cast`` are listed below.  In all cases, a
 null input value is converted into a null output value.
 
+* \(1) Output precision of seconds (``%S``) flag depends on the input timestamp
+  precision. If the number of seconds can not be exactly represented with seconds,

Review comment:
       I don't think this is accurate? It doesn't depend on the number of seconds present, but solely on the unit of the timestamp type?

##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal.cc
##########
@@ -836,6 +841,10 @@ const FunctionDoc strftime_doc{
     "Format timestamps according to a format string",
     ("For each input timestamp, emit a formatted string.\n"
      "The time format string and locale can be set using StrftimeOptions.\n"
+     "Output precision of %S (seconds) flag depends on the input timestamp precision. "

Review comment:
       > I don't know if the elaboration about the "%S" is useful to put here
   
   @pitrou since this behaves differently as eg Python and R's version of `strftime`, I think it's important to note this somewhere (but can also be somewhere else, eg in compute.rst as a note below the table. EDIT: I see it's already added there)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rok commented on a change in pull request #10998: ARROW-13684: [C++][Compute] Strftime kernel follow-up

Posted by GitBox <gi...@apache.org>.
rok commented on a change in pull request #10998:
URL: https://github.com/apache/arrow/pull/10998#discussion_r699390143



##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal.cc
##########
@@ -835,8 +840,12 @@ const FunctionDoc subsecond_doc{
 const FunctionDoc strftime_doc{
     "Format timestamps according to a format string",
     ("For each input timestamp, emit a formatted string.\n"
-     "The time format string and locale can be set using StrftimeOptions.\n"
-     "An error is returned if the timestamps don't have a defined timezone,\n"
+     "The time format string and locale can be set using StrftimeOptions. "
+     "Output precision of %S (seconds) flag depends on the input timestamp precision. "
+     "Timestamps with second precision are represented as integers while milliseconds, "
+     "microsecond and nanoseconds are represented as fixed floating point numbers with "
+     "3, 6 and 9 decimal places respectively.\n"

Review comment:
       Done.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rok commented on a change in pull request #10998: ARROW-13684: [C++][Compute] Strftime kernel follow-up

Posted by GitBox <gi...@apache.org>.
rok commented on a change in pull request #10998:
URL: https://github.com/apache/arrow/pull/10998#discussion_r698647919



##########
File path: docs/source/cpp/compute.rst
##########
@@ -1101,19 +1101,29 @@ number of input and output types.  The type to cast to can be passed in a
 :struct:`CastOptions` instance.  As an alternative, the same service is
 provided by a concrete function :func:`~arrow::compute::Cast`.
 
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| Function name            | Arity      | Input types        | Output type      | Options class                |
-+==========================+============+====================+==================+==============================+
-| cast                     | Unary      | Many               | Variable         | :struct:`CastOptions`        |
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| strftime                 | Unary      | Timestamp          | String           | :struct:`StrftimeOptions`    |
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| strptime                 | Unary      | String-like        | Timestamp        | :struct:`StrptimeOptions`    |
-+--------------------------+------------+--------------------+------------------+------------------------------+
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| Function name   | Arity      | Input types        | Output type      | Options class                | Notes |
++=================+============+====================+==================+==============================+=======+
+| cast            | Unary      | Many               | Variable         | :struct:`CastOptions`        |       |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| strftime        | Unary      | Timestamp          | String           | :struct:`StrftimeOptions`    | \(1)  |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| strptime        | Unary      | String-like        | Timestamp        | :struct:`StrptimeOptions`    |       |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
 
 The conversions available with ``cast`` are listed below.  In all cases, a
 null input value is converted into a null output value.
 
+* \(1) Output precision of seconds (``%S``) flag depends on the input timestamp
+  precision. If the number of seconds can not be exactly represented with seconds,
+  then the format is a decimal floating point number with a fixed format and a
+  precision matching that of the precision of the input. Note precision increases
+  three decimal points points per step going from seconds to nanoseconds.

Review comment:
       Thanks for catching that. I think what happened was an accidental force push.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rok commented on pull request #10998: ARROW-13684: [C++][Compute] Strftime kernel follow-up

Posted by GitBox <gi...@apache.org>.
rok commented on pull request #10998:
URL: https://github.com/apache/arrow/pull/10998#issuecomment-906809114


   Thanks for the comments @westonpace !


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rok commented on a change in pull request #10998: ARROW-13684: [C++][Compute] Strftime kernel follow-up

Posted by GitBox <gi...@apache.org>.
rok commented on a change in pull request #10998:
URL: https://github.com/apache/arrow/pull/10998#discussion_r698660433



##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal.cc
##########
@@ -836,6 +841,10 @@ const FunctionDoc strftime_doc{
     "Format timestamps according to a format string",
     ("For each input timestamp, emit a formatted string.\n"
      "The time format string and locale can be set using StrftimeOptions.\n"
+     "Output precision of %S (seconds) flag depends on the input timestamp precision. "

Review comment:
       This evaluates to true in MySQL on sqlfiddle:
   ```SQL
   SELECT WEEK('2008-1-6', 1) != WEEK('2008-1-6', 4)
   ```
   
   Mode | First day of week | Range | Week 1 is the first week
   -- | -- | -- | --
   0 | Sunday | 0-53 | with a Sunday in this year
   1 | Monday | 0-53 | with 4 or more days this year
   2 | Sunday | 1-53 | with a Sunday in this year
   3 | Monday | 1-53 | with 4 or more days this year
   4 | Sunday | 0-53 | with 4 or more days this year
   5 | Monday | 0-53 | with a Monday in this year
   6 | Sunday | 1-53 | with 4 or more days this year
   7 | Monday | 1-53 | with a Monday in this year
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorisvandenbossche commented on pull request #10998: ARROW-13684: [C++][Compute] Strftime kernel follow-up

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on pull request #10998:
URL: https://github.com/apache/arrow/pull/10998#issuecomment-905604981


   > Default strftime string is now `%Y-%m-%dT%H:%M:%S`. Perhaps `%Y-%m-%dT%H:%M:%S%z` would be better?
   
   Not fully sure about this one. In https://github.com/apache/arrow/pull/10647#issuecomment-901783928 I mentioned the options of showing no timezone information (what you did now), include a numeric offset (so adding `%z`) or erroring. And for UTC it's also an option to use the literal "Z". 
   
   Including a numeric offset of using literal Z (for UTC) both mean that the default format depends on the type (because for timestamp without timezone we shouldn't add an offset).
   
   > Timestamps without timezone are now strftime-ed as if they were in `UTC`. Not sure this is the way to go. What if the local time is really invalid but we can't tell?
   
   Timestamps without timezone should be formatted as is, showing the wall clock time they represent. So I think formatting as if they are in UTC but without any timezone indication is correct. Although we should not allow adding timezone information with `%Z` or `%z` in those cases. 
   I don't think we need to care about invalid local times when printing, that's a user responsibility (and anyway, you only know if a time is invalid once you assume a certain timezone, which we don't do in strftime). 
   
   > Document %S behavior. What would be a good location to do that?
   
   You could start with the compute.rst table notes, but personally I would also include it in the docstring.
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #10998: ARROW-13684: [C++][Compute] Strftime kernel follow-up

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on a change in pull request #10998:
URL: https://github.com/apache/arrow/pull/10998#discussion_r699271130



##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal.cc
##########
@@ -835,8 +840,12 @@ const FunctionDoc subsecond_doc{
 const FunctionDoc strftime_doc{
     "Format timestamps according to a format string",
     ("For each input timestamp, emit a formatted string.\n"
-     "The time format string and locale can be set using StrftimeOptions.\n"
-     "An error is returned if the timestamps don't have a defined timezone,\n"
+     "The time format string and locale can be set using StrftimeOptions. "
+     "Output precision of %S (seconds) flag depends on the input timestamp precision. "
+     "Timestamps with second precision are represented as integers while milliseconds, "
+     "microsecond and nanoseconds are represented as fixed floating point numbers with "
+     "3, 6 and 9 decimal places respectively.\n"

Review comment:
       I would also add something about "To obtain integer seconds, cast to timestamp with second resolution" to be explicit about the workaround.

##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal.cc
##########
@@ -835,8 +840,12 @@ const FunctionDoc subsecond_doc{
 const FunctionDoc strftime_doc{
     "Format timestamps according to a format string",
     ("For each input timestamp, emit a formatted string.\n"
-     "The time format string and locale can be set using StrftimeOptions.\n"
-     "An error is returned if the timestamps don't have a defined timezone,\n"
+     "The time format string and locale can be set using StrftimeOptions. "
+     "Output precision of %S (seconds) flag depends on the input timestamp precision. "
+     "Timestamps with second precision are represented as integers while milliseconds, "
+     "microsecond and nanoseconds are represented as fixed floating point numbers with "
+     "3, 6 and 9 decimal places respectively.\n"
+     "An error is returned if the timestamps don't have a defined timezone,"

Review comment:
       This is no longer up to date

##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal.cc
##########
@@ -476,11 +476,16 @@ struct Strftime {
   static Result<Strftime> Make(KernelContext* ctx, const DataType& type) {
     const StrftimeOptions& options = StrftimeState::Get(ctx);
 
-    const auto& timezone = GetInputTimezone(type);
+    auto timezone = GetInputTimezone(type);
     if (timezone.empty()) {
-      return Status::Invalid(
-          "Timestamps without a time zone cannot be reliably formatted.");
+      if ((options.format.find("%z") != std::string::npos) ||
+          (options.format.find("%Z") != std::string::npos)) {
+        return Status::Invalid("Timezone not present, cannot convert to string: ",

Review comment:
       ```suggestion
           return Status::Invalid("Timezone not present, cannot convert to string with timezone: ",
   ```

##########
File path: docs/source/cpp/compute.rst
##########
@@ -1094,19 +1094,29 @@ number of input and output types.  The type to cast to can be passed in a
 :struct:`CastOptions` instance.  As an alternative, the same service is
 provided by a concrete function :func:`~arrow::compute::Cast`.
 
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| Function name            | Arity      | Input types        | Output type      | Options class                |
-+==========================+============+====================+==================+==============================+
-| cast                     | Unary      | Many               | Variable         | :struct:`CastOptions`        |
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| strftime                 | Unary      | Timestamp          | String           | :struct:`StrftimeOptions`    |
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| strptime                 | Unary      | String-like        | Timestamp        | :struct:`StrptimeOptions`    |
-+--------------------------+------------+--------------------+------------------+------------------------------+
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| Function name   | Arity      | Input types        | Output type      | Options class                | Notes |
++=================+============+====================+==================+==============================+=======+
+| cast            | Unary      | Many               | Variable         | :struct:`CastOptions`        |       |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| strftime        | Unary      | Timestamp          | String           | :struct:`StrftimeOptions`    | \(1)  |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| strptime        | Unary      | String-like        | Timestamp        | :struct:`StrptimeOptions`    |       |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
 
 The conversions available with ``cast`` are listed below.  In all cases, a
 null input value is converted into a null output value.
 
+* \(1) Output precision of seconds (``%S``) flag depends on the input timestamp
+  precision. If the number of seconds can not be exactly represented with seconds,

Review comment:
       It seems you used some explanation about this from https://howardhinnant.github.io/date/date.html#to_stream_formatting, but I would maybe use the same text as you added above in `strftime_doc`, as I think that's clearer / more relevant from Arrow's perspective.

##########
File path: docs/source/cpp/compute.rst
##########
@@ -1094,19 +1094,29 @@ number of input and output types.  The type to cast to can be passed in a
 :struct:`CastOptions` instance.  As an alternative, the same service is
 provided by a concrete function :func:`~arrow::compute::Cast`.
 
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| Function name            | Arity      | Input types        | Output type      | Options class                |
-+==========================+============+====================+==================+==============================+
-| cast                     | Unary      | Many               | Variable         | :struct:`CastOptions`        |
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| strftime                 | Unary      | Timestamp          | String           | :struct:`StrftimeOptions`    |
-+--------------------------+------------+--------------------+------------------+------------------------------+
-| strptime                 | Unary      | String-like        | Timestamp        | :struct:`StrptimeOptions`    |
-+--------------------------+------------+--------------------+------------------+------------------------------+
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| Function name   | Arity      | Input types        | Output type      | Options class                | Notes |
++=================+============+====================+==================+==============================+=======+
+| cast            | Unary      | Many               | Variable         | :struct:`CastOptions`        |       |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| strftime        | Unary      | Timestamp          | String           | :struct:`StrftimeOptions`    | \(1)  |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
+| strptime        | Unary      | String-like        | Timestamp        | :struct:`StrptimeOptions`    |       |
++-----------------+------------+--------------------+------------------+------------------------------+-------+
 
 The conversions available with ``cast`` are listed below.  In all cases, a
 null input value is converted into a null output value.
 
+* \(1) Output precision of seconds (``%S``) flag depends on the input timestamp
+  precision. If the number of seconds can not be exactly represented with seconds,

Review comment:
       I don't think this is accurate? It doesn't depend on the number of seconds present, but solely on the unit of the timestamp type?

##########
File path: cpp/src/arrow/compute/kernels/scalar_temporal.cc
##########
@@ -836,6 +841,10 @@ const FunctionDoc strftime_doc{
     "Format timestamps according to a format string",
     ("For each input timestamp, emit a formatted string.\n"
      "The time format string and locale can be set using StrftimeOptions.\n"
+     "Output precision of %S (seconds) flag depends on the input timestamp precision. "

Review comment:
       > I don't know if the elaboration about the "%S" is useful to put here
   
   @pitrou since this behaves differently as eg Python and R's version of `strftime`, I think it's important to note this somewhere (but can also be somewhere else, eg in compute.rst as a note below the table. EDIT: I see it's already added there)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org