Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2021/09/29 07:45:08 UTC

[GitHub] [beam] blackhogz commented on a change in pull request #15485: [BEAM-10655] Fix conversion of NanosInstant to BigQuery Timestamp

blackhogz commented on a change in pull request #15485:
URL: https://github.com/apache/beam/pull/15485#discussion_r718240426



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryUtils.java
##########
@@ -551,9 +548,9 @@ public static TableRow toTableRow(Row row) {
         return toTableRow((Row) fieldValue);
 
       case DATETIME:
-        return ((Instant) fieldValue)
-            .toDateTime(DateTimeZone.UTC)
-            .toString(BIGQUERY_TIMESTAMP_PRINTER);
+        org.joda.time.Instant jodaInstant = (org.joda.time.Instant) fieldValue;
+        java.time.Instant javaInstant = java.time.Instant.ofEpochMilli(jodaInstant.getMillis());
+        return BIGQUERY_TIMESTAMP_PRINTER.format(javaInstant);

Review comment:
       Thanks @TheNeuralBit. I'm working with @amuletxheart on this and would like to help.
   
   For the opt-in flag, would the approach below be a reasonable way to proceed?
   
   - Add a `BigQueryIO.Write#allowTruncatedTimestamps()` method for explicit opt-in (i.e. the default is false).
   - Pass the value as an additional parameter to `BigQueryUtils.toTableRow()` at the call site [here](https://github.com/apache/beam/blob/0111cff88025f0dc783a0890078b769139c8ae36/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java#L2687), and to `BigQueryUtils.toTableSchema()` [here](https://github.com/apache/beam/blob/0111cff88025f0dc783a0890078b769139c8ae36/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java#L2690).
   - Update `BigQueryUtils.toTableSchema()` to accept the NanosInstant logical type only if the `allowTruncatedTimestamps` parameter is true. Otherwise it should fail with an error before any rows are processed, i.e. before formatFunction is invoked. In fact, with this check in place, I don't think we need to pass `allowTruncatedTimestamps` to `BigQueryUtils.toTableRow()` at all.
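
A minimal sketch of the schema-time check described in the last bullet, using simplified stand-in types so that it runs without Beam on the classpath. `TruncatedTimestampOptIn`, `FieldKind`, and both method signatures are hypothetical and are not Beam's actual API; the real `BigQueryUtils.toTableSchema()` works on Beam `Schema` objects. The sketch only illustrates the idea of rejecting NanosInstant once, when the schema is converted, unless the caller has opted in.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: none of these names exist in Beam.
class TruncatedTimestampOptIn {

  // Simplified stand-in for Beam field types.
  // NANOS_INSTANT models the NanosInstant logical type.
  enum FieldKind { STRING, DATETIME, NANOS_INSTANT }

  // Schema-level check. It runs once, before any row is formatted,
  // so a pipeline without the opt-in fails before rows are processed.
  static List<String> toTableSchema(
      List<FieldKind> fields, boolean allowTruncatedTimestamps) {
    List<String> bigQueryTypes = new ArrayList<>();
    for (FieldKind kind : fields) {
      switch (kind) {
        case STRING:
          bigQueryTypes.add("STRING");
          break;
        case DATETIME:
          bigQueryTypes.add("TIMESTAMP");
          break;
        case NANOS_INSTANT:
          if (!allowTruncatedTimestamps) {
            throw new IllegalArgumentException(
                "NanosInstant would lose precision in a BigQuery TIMESTAMP; "
                    + "opt in with allowTruncatedTimestamps to accept this.");
          }
          bigQueryTypes.add("TIMESTAMP");
          break;
        default:
          throw new IllegalArgumentException("Unsupported field kind: " + kind);
      }
    }
    return bigQueryTypes;
  }

  // Row-level conversion. BigQuery TIMESTAMP has microsecond precision,
  // so the sub-microsecond digits are dropped here.
  static Instant truncateToMicros(Instant value) {
    return value.truncatedTo(ChronoUnit.MICROS);
  }

  public static void main(String[] args) {
    List<FieldKind> fields = Arrays.asList(FieldKind.STRING, FieldKind.NANOS_INSTANT);

    // With the opt-in, the schema converts and rows can be truncated later.
    System.out.println(toTableSchema(fields, true));
    System.out.println(truncateToMicros(Instant.ofEpochSecond(1L, 123_456_789L)));

    // Without the opt-in, conversion fails before any row is touched.
    try {
      toTableSchema(fields, false);
    } catch (IllegalArgumentException e) {
      System.out.println("Rejected: " + e.getMessage());
    }
  }
}
```

Because the rejection happens in the schema conversion, the row-level method in this sketch needs no flag of its own, which matches the observation that `toTableRow()` may not need the extra parameter.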
   
   Please let us know what you think, @TheNeuralBit and other maintainers. Thanks a lot!



