You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@beam.apache.org by "ahmedabu98 (via GitHub)" <gi...@apache.org> on 2023/02/14 20:39:22 UTC

[GitHub] [beam] ahmedabu98 opened a new pull request, #25472: Handle Timestamp fields with UTC in Storage writes

ahmedabu98 opened a new pull request, #25472:
URL: https://github.com/apache/beam/pull/25472

   Allow timestamp fields with UTC at the end
   
   e.g. "2023-02-14 12:00:00.123 UTC"


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] github-actions[bot] commented on pull request #25472: One formatter for Timestamp fields in Storage writes

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #25472:
URL: https://github.com/apache/beam/pull/25472#issuecomment-1435238900

   Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment `assign set of reviewers`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] johnjcasey commented on pull request #25472: One formatter for Timestamp fields in Storage writes

Posted by "johnjcasey (via GitHub)" <gi...@apache.org>.
johnjcasey commented on PR #25472:
URL: https://github.com/apache/beam/pull/25472#issuecomment-1433665586

   LGTM


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] ahmedabu98 commented on pull request #25472: One formatter for Timestamp fields in Storage writes

Posted by "ahmedabu98 (via GitHub)" <gi...@apache.org>.
ahmedabu98 commented on PR #25472:
URL: https://github.com/apache/beam/pull/25472#issuecomment-1431961534

   @johnjcasey, @an2x read for review


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] apilloud commented on pull request #25472: One formatter for Timestamp fields in Storage writes

Posted by "apilloud (via GitHub)" <gi...@apache.org>.
apilloud commented on PR #25472:
URL: https://github.com/apache/beam/pull/25472#issuecomment-1442470851

   R: @johnjcasey @an2x


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] ahmedabu98 commented on a diff in pull request #25472: Handle Timestamp fields with UTC in Storage writes

Posted by "ahmedabu98 (via GitHub)" <gi...@apache.org>.
ahmedabu98 commented on code in PR #25472:
URL: https://github.com/apache/beam/pull/25472#discussion_r1106416825


##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/TableRowToStorageApiProto.java:
##########
@@ -746,6 +755,12 @@ private static void fieldDescriptorFromTableField(
               return ChronoUnit.MICROS.between(
                   Instant.EPOCH, Instant.ofEpochMilli(Long.parseLong((String) value)));
             } catch (NumberFormatException e2) {
+              if (((String) value).endsWith("UTC")) {

Review Comment:
   that was already the current implementation, will change to have one date time formatter to parse from.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] an2x commented on a diff in pull request #25472: One formatter for Timestamp fields in Storage writes

Posted by "an2x (via GitHub)" <gi...@apache.org>.
an2x commented on code in PR #25472:
URL: https://github.com/apache/beam/pull/25472#discussion_r1107262343


##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/TableRowToStorageApiProto.java:
##########
@@ -82,6 +82,17 @@ public class TableRowToStorageApiProto {
           .append(DateTimeFormatter.ISO_LOCAL_DATE)
           .appendLiteral(' ')
           .append(DateTimeFormatter.ISO_LOCAL_TIME)
+          .optionalStart()
+          .appendLiteral(" UTC")

Review Comment:
   I think we should support any time zone here, not just UTC.
   Even better, support all formats listed in [BigQuery documentation](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#canonical_format_for_timestamp_literals), i.e.:
   * 2014-09-27T12:30:00.45Z
   * 2014-09-27 12:30:00.45-8:00
   * 2014-09-27 12:30:00.123456 UTC
   * 2014-09-27 12:30:00.123456 America/Los_Angeles
   Note that in all 4 formats BQ accepts either space or "T" as the date/time separator. And as for the datetime/time zone separator: for "Z" and "-8:00" time zone format there must be no space before the time zone, and for "UTC" and "America/Los_Angeles" format there must be a space before.



##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/TableRowToStorageApiProto.java:
##########
@@ -82,6 +82,17 @@ public class TableRowToStorageApiProto {
           .append(DateTimeFormatter.ISO_LOCAL_DATE)
           .appendLiteral(' ')
           .append(DateTimeFormatter.ISO_LOCAL_TIME)
+          .optionalStart()
+          .appendLiteral(" UTC")

Review Comment:
   I think we should support any time zone here, not just UTC.
   Even better, support all formats listed in [BigQuery documentation](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#canonical_format_for_timestamp_literals), i.e.:
   * 2014-09-27T12:30:00.45Z
   * 2014-09-27 12:30:00.45-8:00
   * 2014-09-27 12:30:00.123456 UTC
   * 2014-09-27 12:30:00.123456 America/Los_Angeles
   
   Note that in all 4 formats BQ accepts either space or "T" as the date/time separator. And as for the datetime/time zone separator: for "Z" and "-8:00" time zone format there must be no space before the time zone, and for "UTC" and "America/Los_Angeles" format there must be a space before.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] github-actions[bot] commented on pull request #25472: One formatter for Timestamp fields in Storage writes

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #25472:
URL: https://github.com/apache/beam/pull/25472#issuecomment-1442460809

   Assigning reviewers. If you would like to opt out of this review, comment `assign to next reviewer`:
   
   R: @apilloud for label java.
   R: @johnjcasey for label io.
   
   Available commands:
   - `stop reviewer notifications` - opt out of the automated review tooling
   - `remind me after tests pass` - tag the comment author after tests pass
   - `waiting on author` - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)
   
   The PR bot will only process comments in the main thread (not review comments).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] github-actions[bot] commented on pull request #25472: One formatter for Timestamp fields in Storage writes

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #25472:
URL: https://github.com/apache/beam/pull/25472#issuecomment-1442472012

   Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] johnjcasey commented on a diff in pull request #25472: Handle Timestamp fields with UTC in Storage writes

Posted by "johnjcasey (via GitHub)" <gi...@apache.org>.
johnjcasey commented on code in PR #25472:
URL: https://github.com/apache/beam/pull/25472#discussion_r1106354913


##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/TableRowToStorageApiProto.java:
##########
@@ -746,6 +755,12 @@ private static void fieldDescriptorFromTableField(
               return ChronoUnit.MICROS.between(
                   Instant.EPOCH, Instant.ofEpochMilli(Long.parseLong((String) value)));
             } catch (NumberFormatException e2) {
+              if (((String) value).endsWith("UTC")) {

Review Comment:
   we should not catch an exception to determine we should use an alternate date time formatter



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] ahmedabu98 commented on pull request #25472: One formatter for Timestamp fields in Storage writes

Posted by "ahmedabu98 (via GitHub)" <gi...@apache.org>.
ahmedabu98 commented on PR #25472:
URL: https://github.com/apache/beam/pull/25472#issuecomment-1430425701

   R: @johnjcasey PTAL


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] ahmedabu98 commented on a diff in pull request #25472: One formatter for Timestamp fields in Storage writes

Posted by "ahmedabu98 (via GitHub)" <gi...@apache.org>.
ahmedabu98 commented on code in PR #25472:
URL: https://github.com/apache/beam/pull/25472#discussion_r1107665737


##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/TableRowToStorageApiProto.java:
##########
@@ -80,11 +81,32 @@ public class TableRowToStorageApiProto {
   private static final DateTimeFormatter DATETIME_SPACE_FORMATTER =
       new DateTimeFormatterBuilder()
           .append(DateTimeFormatter.ISO_LOCAL_DATE)
+          .optionalStart()
           .appendLiteral(' ')
+          .optionalEnd()
+          .optionalStart()
+          .appendLiteral('T')
+          .optionalEnd()

Review Comment:
   FYI this allows for timestamp fields in TableRows to look like this:
   1. `2000-02-13T12:30:00`
   2. `2000-02-13 12:30:00`
   3. `2000-02-13 T12:30:00`
   4. `2000-02-1312:30:00`
   
   All of these would be handled the same way so I don't think there is much concern. If there's an issue however, I can split up the DATETIME_SPACE_FORMATTER so only 1) and 2) are valid



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] johnjcasey merged pull request #25472: One formatter for Timestamp fields in Storage writes

Posted by "johnjcasey (via GitHub)" <gi...@apache.org>.
johnjcasey merged PR #25472:
URL: https://github.com/apache/beam/pull/25472


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org