You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@beam.apache.org by "Andrew Pilloud (Jira)" <ji...@apache.org> on 2021/12/07 03:17:00 UTC

[jira] [Commented] (BEAM-13294) innerBroadcastJoin fails when batch column Nullable, streaming column not

    [ https://issues.apache.org/jira/browse/BEAM-13294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17454348#comment-17454348 ] 

Andrew Pilloud commented on BEAM-13294:
---------------------------------------

The issue appears to occur here:
https://github.com/apache/beam/blob/243128a8fc52798e1b58b0cf1a271d95ee7aa241/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/transforms/CoGroup.java#L520

These involved schemas should all match, they don't:
Lookup key schema: STRING NOT NULL
Map key schema: STRING
Map value schema: STRING

Testing further, the "Map Key Schema" is nullable if either of the other two are. This appears to be a  widening bug.

> innerBroadcastJoin fails when batch column Nullable, streaming column not
> -------------------------------------------------------------------------
>
>                 Key: BEAM-13294
>                 URL: https://issues.apache.org/jira/browse/BEAM-13294
>             Project: Beam
>          Issue Type: Bug
>          Components: dsl-sql, dsl-sql-zetasql, sdk-java-core
>            Reporter: Andrew Pilloud
>            Assignee: Andrew Pilloud
>            Priority: P1
>
> The join always fail when given a batch column that is Nullable and a streaming column that is Not Nullable. This was discovered in BeamSideInputJoinRel but is likely an issue in the Beam core join library.
> Trivial reproduction with a small test diff:
> ./gradlew :runners:direct-java:needsRunnerTests --tests org.apache.beam.sdk.schemas.transforms.JoinTest.testInnerJoinDifferentKeys
> {code}
> --- a/sdks/java/core/src/test/java/org/apache/beam/sdk/schemas/transforms/JoinTest.java
> +++ b/sdks/java/core/src/test/java/org/apache/beam/sdk/schemas/transforms/JoinTest.java
> @@ -51,7 +51,7 @@ public class JoinTest {
>            .build();
>    private static final Schema CG_SCHEMA_2 =
>        Schema.builder()
> - .addStringField("user2")
> + .addNullableField("user2", Schema.FieldType.STRING)
>            .addInt32Field("count2")
>            .addStringField("country2")
>            .build();
> {code}
> At a lower level:
> ./gradlew :runners:direct-java:needsRunnerTests --tests org.apache.beam.sdk.schemas.transforms.CoGroupTest.testCoGroupByDifferentFields
> {code}
> --- a/sdks/java/core/src/test/java/org/apache/beam/sdk/schemas/transforms/CoGroupTest.java
> +++ b/sdks/java/core/src/test/java/org/apache/beam/sdk/schemas/transforms/CoGroupTest.java
> @@ -237,7 +237,7 @@ public class CoGroupTest {
>  
>    private static final Schema CG_SCHEMA_2 =
>        Schema.builder()
> -          .addStringField("user2")
> +          .addNullableField("user2", Schema.FieldType.STRING)
>            .addInt32Field("count2")
>            .addStringField("country2")
>            .build();
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)