Posted to issues@beam.apache.org by "luke de feo (Jira)" <ji...@apache.org> on 2021/03/01 11:07:00 UTC
[jira] [Commented] (BEAM-9502) SchemaCoder is not update compatible
[ https://issues.apache.org/jira/browse/BEAM-9502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292822#comment-17292822 ]
luke de feo commented on BEAM-9502:
-----------------------------------
Hello, is there any update on this?
> SchemaCoder is not update compatible
> ------------------------------------
>
> Key: BEAM-9502
> URL: https://issues.apache.org/jira/browse/BEAM-9502
> Project: Beam
> Issue Type: Bug
> Components: runner-dataflow, sdk-java-core
> Reporter: Yaron Neuman
> Priority: P3
> Labels: Clarified, stale-assigned
> Time Spent: 2h
> Remaining Estimate: 0h
>
> See relevant [dev@ discussion|https://lists.apache.org/thread.html/r720682624251222c2f3140c785463c0ddfdd782fc88d5d2a48a464c7%40%3Cdev.beam.apache.org%3E]. Runners should consider schemas compatible if they have the same fields in the same order. They should also get the ability to re-order fields in equivalent schemas (same fields, possibly out of order) using the encoding_position field (BEAM-10277).
> Original Description:
> h2. SchemaCoder assigns random UUID, causes Dataflow's compatibility check to fail
> After fe4b7794, _Schema.equals_ compares only the UUIDs, for faster comparison.
> After 0b3b18c6, _SchemaCoder_ forces a random UUID when schema.uuid is null.
> Thus, when trying to update (--update) a Dataflow job with row schemas in user code, the compatibility check fails because SchemaCoder produces a different random UUID.
>
> The user can set the UUID after creating the Schema, but not with Schema.Builder,
> and I'm afraid most users, who are not aware of the internal implementation, won't do that.
>
> In my branch, I added _.withUUID_ and _.withRandomUUID_ to _Schema.Builder_,
> but I think a better solution would be to calculate the UUID based on the schema itself.
> Any thoughts?
> [~reuvenlax]
>
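For illustration only, the idea in the original description of calculating the UUID from the schema itself could be sketched roughly as below. This is not Beam code: the Field class and the uuidFor helper are hypothetical stand-ins that model a schema as an ordered list of (name, type name) pairs, and a real Beam Schema carries more detail (nullability, nested types, options) that would also need to be part of the hashed form. The only library call relied on is java.util.UUID.nameUUIDFromBytes, which returns a name-based (type 3, MD5) UUID, so the same input bytes always give the same UUID.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.UUID;

final class DeterministicSchemaUuid {

    /** Hypothetical stand-in for a schema field: a name and a type name only. */
    static final class Field {
        final String name;
        final String typeName;

        Field(String name, String typeName) {
            this.name = name;
            this.typeName = typeName;
        }
    }

    /**
     * Derives a UUID from the ordered field list (assumed helper, not a Beam API).
     * Each string is length-prefixed so that different field lists cannot
     * collapse into the same canonical text.
     */
    static UUID uuidFor(List<Field> fields) {
        StringBuilder canonical = new StringBuilder();
        for (Field f : fields) {
            canonical.append(f.name.length()).append(':').append(f.name);
            canonical.append(f.typeName.length()).append(':').append(f.typeName);
        }
        byte[] bytes = canonical.toString().getBytes(StandardCharsets.UTF_8);
        return UUID.nameUUIDFromBytes(bytes);
    }

    public static void main(String[] args) {
        List<Field> original = Arrays.asList(
                new Field("id", "INT64"), new Field("name", "STRING"));
        // Same fields in the same order, built independently (as after a job restart).
        List<Field> rebuilt = Arrays.asList(
                new Field("id", "INT64"), new Field("name", "STRING"));
        // Same fields in a different order.
        List<Field> reordered = Arrays.asList(
                new Field("name", "STRING"), new Field("id", "INT64"));

        System.out.println(uuidFor(original).equals(uuidFor(rebuilt)));
        System.out.println(uuidFor(original).equals(uuidFor(reordered)));
    }
}
```

With a UUID derived this way, two independently constructed schemas with the same fields in the same order compare equal, which is the property the --update compatibility check needs. Because field order is part of the hashed text, a reordered schema gets a different UUID in this sketch; treating reordered schemas as equivalent would need separate handling, such as the encoding_position approach mentioned for BEAM-10277.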
--
This message was sent by Atlassian Jira
(v8.3.4#803005)