Posted to issues@beam.apache.org by "Brian Hulette (Jira)" <ji...@apache.org> on 2020/10/20 15:36:00 UTC

[jira] [Updated] (BEAM-9502) SchemaCoder assigns random UUID, causes Dataflow's compatibility check to fail

     [ https://issues.apache.org/jira/browse/BEAM-9502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brian Hulette updated BEAM-9502:
--------------------------------
    Description: 


Original Description:

h2. SchemaCoder assigns random UUID, causes Dataflow's compatibility check to fail
After fe4b7794, _Schema.equals_ compares only the UUIDs, for faster comparison.
 After 0b3b18c6, _SchemaCoder_ forces a random UUID when schema.uuid is null.

Thus, when trying to update (--update) a Dataflow job with row schemas in user code, the compatibility check fails because SchemaCoder produces a different random UUID.
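
To make the failure mode concrete, here is a minimal sketch that mirrors the behavior described above. It does not use Beam: _ToySchema_ and _withRandomUuidIfMissing_ are hypothetical stand-ins for illustration only, and the real _Schema_ and _SchemaCoder_ code may differ in detail.

```java
import java.util.List;
import java.util.Objects;
import java.util.UUID;

// Hypothetical stand-in for a schema type; NOT Beam's actual Schema class.
final class ToySchema {
    final List<String> fields;
    UUID uuid; // stays null until something assigns it

    ToySchema(List<String> fields) {
        this.fields = fields;
    }

    // Mirrors the reported behavior: equality is decided by the UUID alone.
    @Override
    public boolean equals(Object o) {
        return o instanceof ToySchema && Objects.equals(uuid, ((ToySchema) o).uuid);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(uuid);
    }
}

final class RandomUuidDemo {
    // Mirrors the reported coder behavior: assign a random UUID when none is set.
    static ToySchema withRandomUuidIfMissing(ToySchema schema) {
        if (schema.uuid == null) {
            schema.uuid = UUID.randomUUID();
        }
        return schema;
    }

    public static void main(String[] args) {
        // The same field list, built once per job submission.
        ToySchema original =
            withRandomUuidIfMissing(new ToySchema(List.of("id:INT64", "name:STRING")));
        ToySchema resubmitted =
            withRandomUuidIfMissing(new ToySchema(List.of("id:INT64", "name:STRING")));

        // The field lists match, yet the schemas compare as different
        // because each one received its own random UUID.
        System.out.println("fields equal:  " + original.fields.equals(resubmitted.fields));
        System.out.println("schemas equal: " + original.equals(resubmitted));
    }
}
```

In this sketch the two schemas describe identical fields, but a UUID-only comparison treats them as unrelated, which is the shape of the mismatch an update would run into.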

 

The user can set the UUID after creating the Schema, but not with Schema.Builder,
 and I'm afraid most users, who are not aware of the internal implementation, won't do that.

 

In my branch, I added _.withUUID_ and _.withRandomUUID_ to _Schema.Builder_.

But I think a better solution would be to calculate the UUID based on the schema itself.
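
A rough sketch of what a schema-derived UUID could look like, using only the JDK. The _DeterministicSchemaUuid_ class, the _fromFields_ helper, and the "name:TYPE" field descriptors are assumptions made for illustration; they are not part of Beam, and a real implementation would need to decide how nested fields, nullability, and options are canonicalized.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.UUID;

final class DeterministicSchemaUuid {
    // Hypothetical helper: derives a name-based (type 3) UUID from field descriptors.
    // Each descriptor is length-prefixed so that ["ab", "c"] and ["a", "bc"]
    // produce different canonical strings.
    static UUID fromFields(List<String> fieldDescriptors) {
        StringBuilder canonical = new StringBuilder();
        for (String field : fieldDescriptors) {
            canonical.append(field.length()).append(':').append(field).append(';');
        }
        byte[] bytes = canonical.toString().getBytes(StandardCharsets.UTF_8);
        return UUID.nameUUIDFromBytes(bytes);
    }

    public static void main(String[] args) {
        UUID first = fromFields(List.of("id:INT64", "name:STRING"));
        UUID second = fromFields(List.of("id:INT64", "name:STRING"));
        UUID changed = fromFields(List.of("id:INT64", "name:STRING", "age:INT32"));

        // Identical field lists map to the same UUID on every run;
        // adding a field changes the canonical string and so the UUID.
        System.out.println("same fields, same UUID:   " + first.equals(second));
        System.out.println("extra field, same UUID:   " + first.equals(changed));
    }
}
```

With something along these lines, resubmitting a job with an unchanged schema would yield the same UUID, so a UUID-based comparison would no longer fail for that reason alone.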

Any thoughts?

[~reuvenlax]

 

  was:
After fe4b7794, _Schema.equals_ compares only the UUIDs, for faster comparison.
 After 0b3b18c6, _SchemaCoder_ forces a random UUID when schema.uuid is null.

Thus, when trying to update (--update) a Dataflow job with row schemas in user code, the compatibility check fails because SchemaCoder produces a different random UUID.

 

The user can set the UUID after creating the Schema, but not with Schema.Builder,
 and I'm afraid most users, who are not aware of the internal implementation, won't do that.

 

In my branch, I added _.withUUID_ and _.withRandomUUID_ to _Schema.Builder_.

But I think a better solution would be to calculate the UUID based on the schema itself.

Any thoughts?

[~reuvenlax]

 


> SchemaCoder assigns random UUID, causes Dataflow's compatibility check to fail
> ------------------------------------------------------------------------------
>
>                 Key: BEAM-9502
>                 URL: https://issues.apache.org/jira/browse/BEAM-9502
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-dataflow, sdk-java-core
>            Reporter: Yaron Neuman
>            Priority: P3
>              Labels: Clarified, stale-assigned
>          Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Original Description:
> h2. SchemaCoder assigns random UUID, causes Dataflow's compatibility check to fail
> After fe4b7794, _Schema.equals_ compares only the UUIDs, for faster comparison.
>  After 0b3b18c6, _SchemaCoder_ forces a random UUID when schema.uuid is null.
> Thus, when trying to update (--update) a Dataflow job with row schemas in user code, the compatibility check fails because SchemaCoder produces a different random UUID.
>  
> The user can set the UUID after creating the Schema, but not with Schema.Builder,
>  and I'm afraid most users, who are not aware of the internal implementation, won't do that.
>  
> In my branch, I added _.withUUID_ and _.withRandomUUID_ to _Schema.Builder_.
> But I think a better solution would be to calculate the UUID based on the schema itself.
> Any thoughts?
> [~reuvenlax]
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)