Posted to issues@beam.apache.org by "Brian Hulette (Jira)" <ji...@apache.org> on 2021/09/21 21:22:00 UTC

[jira] [Commented] (BEAM-12921) PAssert ignore the Schema fields names for testing

    [ https://issues.apache.org/jira/browse/BEAM-12921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17418325#comment-17418325 ] 

Brian Hulette commented on BEAM-12921:
--------------------------------------

Thanks for reporting this! I'll see if I can reproduce it. I double-checked and Row.equals() _does_ check that Schemas are equal: https://github.com/apache/beam/blob/587f58e1a1c3f5caa14f9b47144b5a5296cff9b5/sdks/java/core/src/main/java/org/apache/beam/sdk/values/Row.java#L435

Perhaps something odd is happening with coders here.
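
As a rough illustration of how a coder could hide a field-name mismatch, here is a toy model in plain Java. It is not Beam's actual PAssert or RowCoder code, and `ToyRow` and `encodeValuesOnly` are hypothetical names invented for this sketch. The assumption being illustrated: if a comparison ever happens on encoded bytes, and the encoding carries only the values in positional order, then two rows that differ only in field names would look identical even though equals() tells them apart.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

public class CoderEqualitySketch {

    // Hypothetical stand-in for a Beam Row: ordered field names plus ordered values.
    static final class ToyRow {
        final List<String> fieldNames;
        final List<Object> values;

        ToyRow(List<String> fieldNames, List<Object> values) {
            this.fieldNames = fieldNames;
            this.values = values;
        }

        // Equality that includes field names, analogous to Row.equals() checking the schema.
        @Override
        public boolean equals(Object o) {
            if (!(o instanceof ToyRow)) {
                return false;
            }
            ToyRow other = (ToyRow) o;
            return fieldNames.equals(other.fieldNames) && values.equals(other.values);
        }

        @Override
        public int hashCode() {
            return Objects.hash(fieldNames, values);
        }
    }

    // Hypothetical positional encoding: writes the values only, never the field names.
    static byte[] encodeValuesOnly(ToyRow row) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (Object value : row.values) {
                if (value instanceof Integer) {
                    out.writeInt((Integer) value);
                } else {
                    out.writeUTF(String.valueOf(value));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        ToyRow actual = new ToyRow(
            List.of("appId", "description"), List.<Object>of(1, "Recruiter"));
        ToyRow expected = new ToyRow(
            List.of("appId", "randomName"), List.<Object>of(1, "Recruiter"));

        // Object equality sees the renamed field.
        System.out.println("equals(): " + actual.equals(expected));
        // Byte-level comparison does not, because names are never encoded.
        System.out.println("same bytes: "
            + Arrays.equals(encodeValuesOnly(actual), encodeValuesOnly(expected)));
    }
}
```

Running `main` prints `equals(): false` followed by `same bytes: true`. Whether anything like this actually happens in the reported test is exactly what still needs to be reproduced.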

> PAssert ignore the Schema fields names for testing  
> ----------------------------------------------------
>
>                 Key: BEAM-12921
>                 URL: https://issues.apache.org/jira/browse/BEAM-12921
>             Project: Beam
>          Issue Type: Bug
>          Components: dsl-sql
>            Reporter: Sanil Jain
>            Priority: P2
>
> I found this bug while testing the Select operator: PAssert ignores field names, so the test below passes even though the expected rows use a renamed field.
> Beam version: 2.26.0.8
> {code:java}
> private static final Schema APP_SCHEMA = Schema.builder()
>     .addInt32Field("appId")
>     .addStringField("description")
>     .addFloatField("rating")
>     .build();
>
> @Test
> public void testProjectOperator() {
>   PCollection<Row> projectedOutput =
>       generateTestRow(pipeline).apply(Select.fieldNames("appId", "description"));
>
>   // Modified schema with renamed field
>   Schema modifiedSchema = Schema.builder()
>       .addInt32Field("appId")
>       .addStringField("randomName") // this should ideally break
>       .build();
>
>   PAssert.that(projectedOutput).containsInAnyOrder(
>       Row.withSchema(modifiedSchema).addValues(-8, "Invalid").build(),
>       Row.withSchema(modifiedSchema).addValues(0, "Invalid").build(),
>       Row.withSchema(modifiedSchema).addValues(1, "Recruiter").build(),
>       Row.withSchema(modifiedSchema).addValues(2, "Hirein").build(),
>       Row.withSchema(modifiedSchema).addValues(1, "Workplace").build()
>   );
>   pipeline.run().waitUntilFinish();
> }
>
> public static PCollection<Row> generateTestRow(Pipeline pipeline) {
>   // Create a concrete row with that type.
>   return PBegin
>           .in(pipeline)
>           .apply(Create.of(
>               Row.withSchema(APP_SCHEMA).addValues(-8, "Invalid", 0f).build(),
>               Row.withSchema(APP_SCHEMA).addValues(0, "Invalid", -1.1f).build(),
>               Row.withSchema(APP_SCHEMA).addValues(1, "Recruiter", 4.2f).build(),
>               Row.withSchema(APP_SCHEMA).addValues(2, "Hirein", 3.5f).build(),
>               Row.withSchema(APP_SCHEMA).addValues(1, "Workplace", 3f).build())
>           .withCoder(RowCoder.of(APP_SCHEMA)));
> }
> {code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)