You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "AlenkaF (via GitHub)" <gi...@apache.org> on 2023/12/18 09:29:47 UTC

[PR] GH-30117: [C++][Python] Add "Z" to the end of timestamp print string when tz defined [arrow]

AlenkaF opened a new pull request, #39272:
URL: https://github.com/apache/arrow/pull/39272

   ### What changes are included in this PR?
   
   This PR updates the PrettyPrint for Timestamp type so that "Z" is printed at the end of the output string if the timezone has been defined. This way we add minimum information about the values being stored in UTC.
   
   ### Are these changes tested?
   
   Yes.
   
   ### Are there any user-facing changes?
   
   There is a change in how `TimestampArray` prints out the data. With this change "Z" would be added to the end of the string if the timezone is defined.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] GH-30117: [C++][Python] Add "Z" to the end of timestamp print string when tz defined [arrow]

Posted by "AlenkaF (via GitHub)" <gi...@apache.org>.
AlenkaF commented on PR #39272:
URL: https://github.com/apache/arrow/pull/39272#issuecomment-1862523793

   > >Given this is a completely separate code path (this is something defined in the python repr, not the C++ pretty printer), let's keep that for a separate PR? (but yes, I agree it would be good to do that!)
   
   >Agreed, let's make another issue for it.
   
   It would be good to add that to R also, if I am not mistaken. Will create an issue for it 👍 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] GH-30117: [C++][Python] Add "Z" to the end of timestamp print string when tz defined [arrow]

Posted by "rok (via GitHub)" <gi...@apache.org>.
rok commented on code in PR #39272:
URL: https://github.com/apache/arrow/pull/39272#discussion_r1430142902


##########
python/pyarrow/tests/test_types.py:
##########
@@ -487,6 +487,17 @@ def test_timestamp():
             pa.timestamp(invalid_unit)
 
 
+def test_timestamp_print():
+    for unit in ('s', 'ms', 'us', 'ns'):
+        for tz in ('UTC', 'Europe/Paris', 'Pacific/Marquesas',
+                   'Mars/Mariner_Valley', '-00:42', '+42:00'):
+            ty = pa.timestamp(unit, tz=tz)
+            arr = pa.array([0], ty)
+            assert "Z" in str(arr)
+        arr = pa.array([0])

Review Comment:
   ```suggestion
           arr = pa.array([0], pa.timestamp(unit))
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] GH-30117: [C++][Python] Add "Z" to the end of timestamp print string when tz defined [arrow]

Posted by "jorisvandenbossche (via GitHub)" <gi...@apache.org>.
jorisvandenbossche commented on code in PR #39272:
URL: https://github.com/apache/arrow/pull/39272#discussion_r1432426148


##########
cpp/src/arrow/util/formatting_util_test.cc:
##########
@@ -522,6 +522,50 @@ TEST(Formatting, Timestamp) {
     AssertFormatting(formatter, -2203932304LL * 1000000000LL + 8,
                      "1900-02-28 12:34:56.000000008");
   }
+
+  {
+    auto timestamp_types = {timestamp(TimeUnit::SECOND, "US/Eastern"),
+                            timestamp(TimeUnit::SECOND, "+01:00"),
+                            timestamp(TimeUnit::SECOND, "Mars/Mariner_Valley")};
+    for (auto ty : timestamp_types) {
+      StringFormatter<TimestampType> formatter(ty.get());
+
+      AssertFormatting(formatter, 0, "1970-01-01 00:00:00Z");
+    }
+  }
+
+  {
+    auto timestamp_types = {timestamp(TimeUnit::MILLI, "US/Eastern"),
+                            timestamp(TimeUnit::MILLI, "Pacific/Maruesas"),
+                            timestamp(TimeUnit::MILLI, "Mars/Mariner_Valley")};
+    for (auto ty : timestamp_types) {
+      StringFormatter<TimestampType> formatter(ty.get());
+
+      AssertFormatting(formatter, 0, "1970-01-01 00:00:00.000Z");
+    }
+  }

Review Comment:
   Personally, I don't think it is necessary to test multiple timezones for the other resolutions (there is nothing in the code that currently is tz-dependent, and it would still be caught by the 3 different timezones tested above for SECOND resolution). I would simplify this to just test a single one.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] GH-30117: [C++][Python] Add "Z" to the end of timestamp print string when tz defined [arrow]

Posted by "AlenkaF (via GitHub)" <gi...@apache.org>.
AlenkaF commented on code in PR #39272:
URL: https://github.com/apache/arrow/pull/39272#discussion_r1430179736


##########
python/pyarrow/tests/test_types.py:
##########
@@ -487,6 +487,17 @@ def test_timestamp():
             pa.timestamp(invalid_unit)
 
 
+def test_timestamp_print():
+    for unit in ('s', 'ms', 'us', 'ns'):
+        for tz in ('UTC', 'Europe/Paris', 'Pacific/Marquesas',
+                   'Mars/Mariner_Valley', '-00:42', '+42:00'):
+            ty = pa.timestamp(unit, tz=tz)
+            arr = pa.array([0], ty)
+            assert "Z" in str(arr)
+        arr = pa.array([0])

Review Comment:
   Oh, thanks for catching that!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] GH-30117: [C++][Python] Add "Z" to the end of timestamp print string when tz defined [arrow]

Posted by "jorisvandenbossche (via GitHub)" <gi...@apache.org>.
jorisvandenbossche commented on PR #39272:
URL: https://github.com/apache/arrow/pull/39272#issuecomment-1862359013

   >  does this currently print timestamp's type when printing the array? 
   
   Given this is a completely separate code path (this is something defined in the python repr, not the C++ pretty printer), let's keep that for a separate PR? 
   (but yes, I agree it would be good to do that!)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] GH-30117: [C++][Python] Add "Z" to the end of timestamp print string when tz defined [arrow]

Posted by "rok (via GitHub)" <gi...@apache.org>.
rok commented on PR #39272:
URL: https://github.com/apache/arrow/pull/39272#issuecomment-1862442433

   > Given this is a completely separate code path (this is something defined in the python repr, not the C++ pretty printer), let's keep that for a separate PR? (but yes, I agree it would be good to do that!)
   
   Agreed, let's make another issue for it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] GH-30117: [C++][Python] Add "Z" to the end of timestamp print string when tz defined [arrow]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #39272:
URL: https://github.com/apache/arrow/pull/39272#issuecomment-1859906733

   :warning: GitHub issue #30117 **has been automatically assigned in GitHub** to PR creator.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] GH-30117: [C++][Python] Add "Z" to the end of timestamp print string when tz defined [arrow]

Posted by "AlenkaF (via GitHub)" <gi...@apache.org>.
AlenkaF commented on PR #39272:
URL: https://github.com/apache/arrow/pull/39272#issuecomment-1864133192

   Opened an issue for printing timezone information in `TimestampArray`: https://github.com/apache/arrow/issues/39315
   cc @rok @jorisvandenbossche 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] GH-30117: [C++][Python] Add "Z" to the end of timestamp print string when tz defined [arrow]

Posted by "conbench-apache-arrow[bot] (via GitHub)" <gi...@apache.org>.
conbench-apache-arrow[bot] commented on PR #39272:
URL: https://github.com/apache/arrow/pull/39272#issuecomment-1881928064

   After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit a288364d971ab9a6a3f05a903a5df83ebeddf0a0.
   
   There were no benchmark performance regressions. 🎉
   
   The [full Conbench report](https://github.com/apache/arrow/runs/20280516047) has more details. It also includes information about 4 possible false positives for unstable benchmarks that are known to sometimes produce them.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] GH-30117: [C++][Python] Add "Z" to the end of timestamp print string when tz defined [arrow]

Posted by "rok (via GitHub)" <gi...@apache.org>.
rok commented on PR #39272:
URL: https://github.com/apache/arrow/pull/39272#issuecomment-1860830125

   > as mentioned on the issue, we can add this if find important.
   
   If there's an easy way to include timezone it would be great!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] GH-30117: [C++][Python] Add "Z" to the end of timestamp print string when tz defined [arrow]

Posted by "jorisvandenbossche (via GitHub)" <gi...@apache.org>.
jorisvandenbossche commented on PR #39272:
URL: https://github.com/apache/arrow/pull/39272#issuecomment-1881004496

   Thanks!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] GH-30117: [C++][Python] Add "Z" to the end of timestamp print string when tz defined [arrow]

Posted by "rok (via GitHub)" <gi...@apache.org>.
rok commented on PR #39272:
URL: https://github.com/apache/arrow/pull/39272#issuecomment-1860710122

   Forgot to ask: does this currently print timestamp's type when printing the array? And if it does can you add a test that it prints the correct timezone. If not we can leave it for the follow-up.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] GH-30117: [C++][Python] Add "Z" to the end of timestamp print string when tz defined [arrow]

Posted by "jorisvandenbossche (via GitHub)" <gi...@apache.org>.
jorisvandenbossche merged PR #39272:
URL: https://github.com/apache/arrow/pull/39272


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] GH-30117: [C++][Python] Add "Z" to the end of timestamp print string when tz defined [arrow]

Posted by "AlenkaF (via GitHub)" <gi...@apache.org>.
AlenkaF commented on code in PR #39272:
URL: https://github.com/apache/arrow/pull/39272#discussion_r1432451906


##########
cpp/src/arrow/util/formatting_util_test.cc:
##########
@@ -522,6 +522,50 @@ TEST(Formatting, Timestamp) {
     AssertFormatting(formatter, -2203932304LL * 1000000000LL + 8,
                      "1900-02-28 12:34:56.000000008");
   }
+
+  {
+    auto timestamp_types = {timestamp(TimeUnit::SECOND, "US/Eastern"),
+                            timestamp(TimeUnit::SECOND, "+01:00"),
+                            timestamp(TimeUnit::SECOND, "Mars/Mariner_Valley")};
+    for (auto ty : timestamp_types) {
+      StringFormatter<TimestampType> formatter(ty.get());
+
+      AssertFormatting(formatter, 0, "1970-01-01 00:00:00Z");
+    }
+  }
+
+  {
+    auto timestamp_types = {timestamp(TimeUnit::MILLI, "US/Eastern"),
+                            timestamp(TimeUnit::MILLI, "Pacific/Maruesas"),
+                            timestamp(TimeUnit::MILLI, "Mars/Mariner_Valley")};
+    for (auto ty : timestamp_types) {
+      StringFormatter<TimestampType> formatter(ty.get());
+
+      AssertFormatting(formatter, 0, "1970-01-01 00:00:00.000Z");
+    }
+  }

Review Comment:
   I think I got it now =)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] GH-30117: [C++][Python] Add "Z" to the end of timestamp print string when tz defined [arrow]

Posted by "jorisvandenbossche (via GitHub)" <gi...@apache.org>.
jorisvandenbossche commented on code in PR #39272:
URL: https://github.com/apache/arrow/pull/39272#discussion_r1431095106


##########
cpp/src/arrow/util/formatting_util_test.cc:
##########
@@ -522,6 +522,28 @@ TEST(Formatting, Timestamp) {
     AssertFormatting(formatter, -2203932304LL * 1000000000LL + 8,
                      "1900-02-28 12:34:56.000000008");
   }
+
+  {
+    auto timestamp_types = {timestamp(TimeUnit::SECOND, "US/Eastern"),
+                            timestamp(TimeUnit::SECOND, "+01:00"),
+                            timestamp(TimeUnit::SECOND, "-42:00"),
+                            timestamp(TimeUnit::SECOND, "Pacific/Maruesas"),
+                            timestamp(TimeUnit::SECOND, "Mars/Mariner_Valley")};

Review Comment:
   Can you also test with different resolutions?
   
   (it might not be needed to test the whole list of values below for each type, we maybe can also just test one of them per resolution/tz outside of a loop, if that makes things simpler)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] GH-30117: [C++][Python] Add "Z" to the end of timestamp print string when tz defined [arrow]

Posted by "rok (via GitHub)" <gi...@apache.org>.
rok commented on code in PR #39272:
URL: https://github.com/apache/arrow/pull/39272#discussion_r1430030116


##########
python/pyarrow/tests/test_types.py:
##########
@@ -487,6 +487,16 @@ def test_timestamp():
             pa.timestamp(invalid_unit)
 
 
+def test_timestamp_print():
+    for unit in ('s', 'ms', 'us', 'ns'):
+        for tz in ('UTC', 'Europe/Paris'):

Review Comment:
   ```suggestion
           for tz in ('UTC', 'Europe/Paris', 'Pacific/Marquesas', 'Mars/Mariner_Valley', '-00:42', '+42:00'):
   ```



##########
cpp/src/arrow/util/formatting_util_test.cc:
##########
@@ -522,6 +522,22 @@ TEST(Formatting, Timestamp) {
     AssertFormatting(formatter, -2203932304LL * 1000000000LL + 8,
                      "1900-02-28 12:34:56.000000008");
   }
+
+  {
+    auto ty = timestamp(TimeUnit::SECOND, "US/Eastern");
+    StringFormatter<TimestampType> formatter(ty.get());
+
+    AssertFormatting(formatter, 0, "1970-01-01 00:00:00Z");
+    AssertFormatting(formatter, 1, "1970-01-01 00:00:01Z");
+    AssertFormatting(formatter, 24 * 60 * 60, "1970-01-02 00:00:00Z");
+    AssertFormatting(formatter, 616377600, "1989-07-14 00:00:00Z");
+    AssertFormatting(formatter, 951782400, "2000-02-29 00:00:00Z");
+    AssertFormatting(formatter, 63730281600LL, "3989-07-14 00:00:00Z");
+    AssertFormatting(formatter, -2203977600LL, "1900-02-28 00:00:00Z");
+
+    AssertFormatting(formatter, 1542129070, "2018-11-13 17:11:10Z");
+    AssertFormatting(formatter, -2203932304LL, "1900-02-28 12:34:56Z");
+  }

Review Comment:
   ```suggestion
   
     {
       auto timestamp_types = {timestamp(TimeUnit::SECOND, "US/Eastern"),
                             timestamp(TimeUnit::SECOND, "+01:00"),
                             timestamp(TimeUnit::SECOND, "-42:00"),
                             timestamp(TimeUnit::SECOND, "Pacific/Maruesas"),
                             timestamp(TimeUnit::SECOND, "Mars/Mariner_Valley")};
       for (auto ty : timestamp_types) {
         auto ty = timestamp(TimeUnit::SECOND, "US/Eastern");
         StringFormatter<TimestampType> formatter(ty.get());
     
         AssertFormatting(formatter, 0, "1970-01-01 00:00:00Z");
         AssertFormatting(formatter, 1, "1970-01-01 00:00:01Z");
         AssertFormatting(formatter, 24 * 60 * 60, "1970-01-02 00:00:00Z");
         AssertFormatting(formatter, 616377600, "1989-07-14 00:00:00Z");
         AssertFormatting(formatter, 951782400, "2000-02-29 00:00:00Z");
         AssertFormatting(formatter, 63730281600LL, "3989-07-14 00:00:00Z");
         AssertFormatting(formatter, -2203977600LL, "1900-02-28 00:00:00Z");
     
         AssertFormatting(formatter, 1542129070, "2018-11-13 17:11:10Z");
         AssertFormatting(formatter, -2203932304LL, "1900-02-28 12:34:56Z");
       }
     }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] GH-30117: [C++][Python] Add "Z" to the end of timestamp print string when tz defined [arrow]

Posted by "AlenkaF (via GitHub)" <gi...@apache.org>.
AlenkaF commented on PR #39272:
URL: https://github.com/apache/arrow/pull/39272#issuecomment-1860738914

   > Forgot to ask: does this currently print timestamp's type when printing the array? And if it does can you add a test that it prints the correct timezone. If not we can leave it for the follow-up.
   
   No, when printing the array there is no info about the type, one should still inspect the type separately:
   
   ```python
   >>> import pyarrow as pa
   >>> a = pa.array([0], pa.timestamp('s', tz='+02:00'))
   >>> a
   <pyarrow.lib.TimestampArray object at 0x131399fc0>
   [
     1970-01-01 00:00:00Z
   ]
   
   >>> a.type
   TimestampType(timestamp[s, tz=+02:00])
   ```
   
   as mentioned on the issue, we can add this if find important.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org