Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2023/01/06 00:33:29 UTC

[GitHub] [beam] Abacn opened a new pull request, #24910: Attempt deserialize all non-standard logical types from proto

Abacn opened a new pull request, #24910:
URL: https://github.com/apache/beam/pull/24910

   Fixes #24870
   
   * Fixes portable and not-yet-standard logical types being deserialized to UnknownLogicalType
   
   This was broken by #23014. Prior to that change, SchemaTranslation set the URN of all logical types that are not yet STANDARD to URN_BEAM_LOGICAL_JAVASDK when translating to proto:
   
   https://github.com/apache/beam/blob/release-2.42.0/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/SchemaTranslation.java#L183
   
   After the change, it keeps the URNs of portable logical types. This enables translating them from another SDK:
   
   https://github.com/apache/beam/blob/release-2.43.0/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/SchemaTranslation.java#L176
   
   However, the code that translates back from proto did not change:
   
   https://github.com/apache/beam/blob/f9a86e5e2e3bbb345dad0953d7fe6a6b8ffe7a68/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/SchemaTranslation.java#L417
   
   As a result, deserialization is never attempted for these types, and they are reduced to UnknownLogicalType.
   
   Unfortunately, a Schema containing a logical type with the same URN but a different class is not considered equal, and this is not caught by the unit test SchemaTranslationTest.FromProtoToProtoTest. This could be a follow-up fix.
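
The fallback described above can be sketched without any Beam dependency. This is only an illustration of the control flow, not the code in SchemaTranslation: `LogicalTypeFallbackSketch`, `DateType`, `UnknownType`, `serialize`, and `fromPayload` are hypothetical names, plain `java.io` serialization stands in for `SerializableUtils`, and the URN string is used purely as sample data.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class LogicalTypeFallbackSketch {
  // Same prefix test as the diff's urn.startsWith("beam:logical_type:").
  static final String PORTABLE_PREFIX = "beam:logical_type:";

  /** Hypothetical stand-in for a concrete, Java-serializable logical type. */
  static final class DateType implements Serializable {
    private static final long serialVersionUID = 1L;
  }

  /** Hypothetical stand-in for UnknownLogicalType: keeps only the URN and raw payload. */
  static final class UnknownType {
    final String urn;
    final byte[] payload;

    UnknownType(String urn, byte[] payload) {
      this.urn = urn;
      this.payload = payload;
    }
  }

  /** Java-serializes a value, as the sending side would when filling the payload. */
  static byte[] serialize(Serializable value) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(value);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }

  /** Attempts deserialization for any portable URN; falls back to UnknownType on failure. */
  static Object fromPayload(String urn, byte[] payload) {
    if (urn.startsWith(PORTABLE_PREFIX)) {
      try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(payload))) {
        return in.readObject();
      } catch (IOException | ClassNotFoundException e) {
        // The payload is not a Java-serialized object this JVM can load; fall through.
      }
    }
    return new UnknownType(urn, payload);
  }

  public static void main(String[] args) {
    String urn = "beam:logical_type:date:v1"; // sample URN for illustration only
    Object known = fromPayload(urn, serialize(new DateType()));
    Object unknown = fromPayload(urn, new byte[] {1, 2, 3});
    System.out.println(known.getClass().getSimpleName());
    System.out.println(unknown.getClass().getSimpleName());
  }
}
```

The point of the sketch is that a payload written by another SDK, which is not a Java-serialized object, degrades to the unknown placeholder instead of failing the whole schema translation.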
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] Abacn commented on pull request #24910: Attempt deserialize all non-standard logical types from proto

Posted by GitBox <gi...@apache.org>.
Abacn commented on PR #24910:
URL: https://github.com/apache/beam/pull/24910#issuecomment-1372981918

   A draft PR, #23785, ran the use case of #24870 and succeeded: https://ci-beam.apache.org/job/beam_PostCommit_Java_DataflowV1_PR/158/testReport/org.apache.beam.sdk.io.gcp.bigquery/BigQueryIOJsonIT/testDateType/
   
   The same run without this change did indeed fail: https://ci-beam.apache.org/job/beam_PostCommit_Java_DataflowV1_PR/156/testReport/org.apache.beam.sdk.io.gcp.bigquery/BigQueryIOJsonIT/testDateType/
   
   




[GitHub] [beam] Abacn commented on pull request #24910: Attempt deserialize all non-standard portable logical types from proto

Posted by GitBox <gi...@apache.org>.
Abacn commented on PR #24910:
URL: https://github.com/apache/beam/pull/24910#issuecomment-1373991018

   > Will you open a cherry pick pull request as well?
   
   Thanks!
   
   opened #24925




[GitHub] [beam] kennknowles commented on a diff in pull request #24910: Attempt deserialize all non-standard portable logical types from proto

Posted by GitBox <gi...@apache.org>.
kennknowles commented on code in PR #24910:
URL: https://github.com/apache/beam/pull/24910#discussion_r1063490794


##########
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/SchemaTranslation.java:
##########
@@ -426,26 +431,33 @@ private static FieldType fieldTypeFromProtoWithoutNullable(SchemaApi.FieldType p
           return FieldType.DATETIME;
         } else if (urn.equals(URN_BEAM_LOGICAL_DECIMAL)) {
           return FieldType.DECIMAL;
-        } else if (urn.equals(URN_BEAM_LOGICAL_JAVASDK)) {
-          return FieldType.logicalType(
-              (LogicalType)
-                  SerializableUtils.deserializeFromByteArray(
-                      logicalType.getPayload().toByteArray(), "logicalType"));
-        } else {
-          @Nullable FieldType argumentType = null;
-          @Nullable Object argumentValue = null;
-          if (logicalType.hasArgumentType()) {
-            argumentType = fieldTypeFromProto(logicalType.getArgumentType());
-            argumentValue = fieldValueFromProto(argumentType, logicalType.getArgument());
+        } else if (urn.startsWith("beam:logical_type:")) {
+          try {
+            return FieldType.logicalType(
+                (LogicalType)
+                    SerializableUtils.deserializeFromByteArray(
+                        logicalType.getPayload().toByteArray(), "logicalType"));
+          } catch (IllegalArgumentException e) {
+            LOG.warn(
+                String.format(

Review Comment:
   `LOG.warn("Unable to deserialize the logical type {} from proto. Mark as UnknownLogicalType", urn)`
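
The suggestion replaces an eager `String.format` call with a message template plus arguments, so the logging framework does the substitution only when the record is actually emitted. As a rough, dependency-free analogue (an assumption for illustration: this uses `java.util.logging` with its `{0}` placeholder rather than the `{}` style in the comment above, and `ParameterizedLogSketch`, `CapturingHandler`, `warnUnknown`, and `render` are hypothetical names), the idea looks like this:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;

class ParameterizedLogSketch {
  /** Collects records in memory so the example can inspect what was logged. */
  static final class CapturingHandler extends Handler {
    final List<LogRecord> records = new ArrayList<>();

    @Override
    public void publish(LogRecord record) {
      records.add(record);
    }

    @Override
    public void flush() {}

    @Override
    public void close() {}
  }

  /** Logs a template and its argument separately; no string is built at the call site. */
  static void warnUnknown(Logger log, String urn) {
    log.log(
        Level.WARNING,
        "Unable to deserialize the logical type {0} from proto. Mark as UnknownLogicalType",
        urn);
  }

  /** Renders a captured record the way a formatter would, substituting the parameters. */
  static String render(LogRecord record) {
    return new SimpleFormatter().formatMessage(record);
  }

  public static void main(String[] args) {
    Logger log = Logger.getLogger("sketch");
    log.setUseParentHandlers(false); // keep the demo output to the single line printed below
    CapturingHandler handler = new CapturingHandler();
    log.addHandler(handler);

    warnUnknown(log, "beam:logical_type:date:v1"); // sample URN for illustration only
    System.out.println(render(handler.records.get(0)));
  }
}
```

The record keeps the template and the argument apart, which is what makes the deferred formatting possible.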



##########
sdks/java/core/src/test/java/org/apache/beam/sdk/schemas/SchemaTranslationTest.java:
##########
@@ -395,6 +402,41 @@ public void typeInfoNotSet() {
     }
   }
 
+  @RunWith(JUnit4.class)
+  public static class LogicalTypesTest {
+    @Test
+    public void testPortableLogicalTypeSerializeDeserilizeCorrectly() {
+      List<Schema.FieldType> testCases =
+          ImmutableList.<Schema.FieldType>builder()
+              .add(FieldType.logicalType(SqlTypes.DATE))
+              .add(FieldType.logicalType(SqlTypes.TIME))
+              .add(FieldType.logicalType(SqlTypes.DATETIME))
+              .add(FieldType.logicalType(SqlTypes.TIMESTAMP))
+              .add(FieldType.logicalType(new NanosInstant()))
+              .add(FieldType.logicalType(new NanosDuration()))
+              .add(FieldType.logicalType(FixedBytes.of(10)))
+              .add(FieldType.logicalType(VariableBytes.of(10)))
+              .add(FieldType.logicalType(FixedString.of(10)))
+              .add(FieldType.logicalType(VariableString.of(10)))
+              .add(FieldType.logicalType(FixedPrecisionNumeric.of(10)))
+              .build();
+
+      for (Schema.FieldType fieldType : testCases) {

Review Comment:
   You can do this with `@RunWith(Parameterized.class)` as a cleanup.





[GitHub] [beam] kennknowles commented on pull request #24910: Attempt deserialize all non-standard portable logical types from proto

Posted by GitBox <gi...@apache.org>.
kennknowles commented on PR #24910:
URL: https://github.com/apache/beam/pull/24910#issuecomment-1373977984

   Will you open a cherry pick pull request as well?




[GitHub] [beam] github-actions[bot] commented on pull request #24910: Attempt deserialize all non-standard portable logical types from proto

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #24910:
URL: https://github.com/apache/beam/pull/24910#issuecomment-1373060097

   Assigning reviewers. If you would like to opt out of this review, comment `assign to next reviewer`:
   
   R: @apilloud for label java.
   
   Available commands:
   - `stop reviewer notifications` - opt out of the automated review tooling
   - `remind me after tests pass` - tag the comment author after tests pass
   - `waiting on author` - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)
   
   The PR bot will only process comments in the main thread (not review comments).




[GitHub] [beam] Abacn commented on pull request #24910: Attempt deserialize all non-standard portable logical types from proto

Posted by GitBox <gi...@apache.org>.
Abacn commented on PR #24910:
URL: https://github.com/apache/beam/pull/24910#issuecomment-1373067236

   CC: @ahmedabu98 
   CC: @kennknowles 




[GitHub] [beam] Abacn commented on pull request #24910: Attempt deserialize all non-standard portable logical types from proto

Posted by GitBox <gi...@apache.org>.
Abacn commented on PR #24910:
URL: https://github.com/apache/beam/pull/24910#issuecomment-1373883158

   Run Java PreCommit




[GitHub] [beam] Abacn commented on pull request #24910: Attempt deserialize all non-standard portable logical types from proto

Posted by GitBox <gi...@apache.org>.
Abacn commented on PR #24910:
URL: https://github.com/apache/beam/pull/24910#issuecomment-1373973108

   All tests succeeded. PTAL @kennknowles, thanks!




[GitHub] [beam] kennknowles commented on pull request #24910: Attempt deserialize all non-standard portable logical types from proto

Posted by GitBox <gi...@apache.org>.
kennknowles commented on PR #24910:
URL: https://github.com/apache/beam/pull/24910#issuecomment-1373977337

   Nice!




[GitHub] [beam] kennknowles merged pull request #24910: Attempt deserialize all non-standard portable logical types from proto

Posted by GitBox <gi...@apache.org>.
kennknowles merged PR #24910:
URL: https://github.com/apache/beam/pull/24910

