Posted to issues@beam.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2020/10/20 15:41:00 UTC

[jira] [Work logged] (BEAM-9502) SchemaCoder is not update compatible

     [ https://issues.apache.org/jira/browse/BEAM-9502?focusedWorklogId=502748&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-502748 ]

ASF GitHub Bot logged work on BEAM-9502:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 20/Oct/20 15:40
            Start Date: 20/Oct/20 15:40
    Worklog Time Spent: 10m 
      Work Description: TheNeuralBit commented on pull request #13049:
URL: https://github.com/apache/beam/pull/13049#issuecomment-712944106


   Let's keep BEAM-9502 open to track the overall problem of update compatibility in SchemaCoder


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 502748)
    Time Spent: 2h  (was: 1h 50m)

> SchemaCoder is not update compatible
> ------------------------------------
>
>                 Key: BEAM-9502
>                 URL: https://issues.apache.org/jira/browse/BEAM-9502
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-dataflow, sdk-java-core
>            Reporter: Yaron Neuman
>            Priority: P3
>              Labels: Clarified, stale-assigned
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> See relevant [dev@ discussion|https://lists.apache.org/thread.html/r720682624251222c2f3140c785463c0ddfdd782fc88d5d2a48a464c7%40%3Cdev.beam.apache.org%3E]. Runners should consider schemas compatible if they have the same fields in the same order. They should also get the ability to re-order fields in equivalent schemas (same fields, possibly out of order) using the encoding_position field (BEAM-10277).
> Original Description:
> h2. SchemaCoder assigns random UUID, causes Dataflow's compatibility check to fail
> After fe4b7794, _Schema.equals_ compares only the UUIDs, for faster comparison.
> After 0b3b18c6, _SchemaCoder_ forces a random UUID when schema.uuid is null.
> Thus, when trying to update (--update) a Dataflow job with row schemas in user code, the compatibility check fails because SchemaCoder produces a different random UUID.
>  
> The user can set the UUID after creating the Schema, but not with Schema.Builder,
> and I'm afraid most users, who are not aware of the internal implementation, won't do that.
>  
> In my branch, I added _.withUUID_ and _.withRandomUUID_ to _Schema.Builder_,
> but I think a better solution would be to calculate the UUID based on the schema itself.
> Any thoughts?
> [~reuvenlax]
>  
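
The suggestion above, calculating the UUID from the schema itself, could look roughly like the following. This is only a minimal sketch, not Beam code: the class SchemaFingerprint, the method uuidFor, and the representation of a schema as an ordered map of field name to type name are all hypothetical. It relies on java.util.UUID.nameUUIDFromBytes, which returns a name-based (type 3) UUID, so identical input bytes always yield the same UUID.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical helper, not part of Beam: derives a stable UUID from a schema's
// field names and type names instead of generating a random one.
final class SchemaFingerprint {
  private SchemaFingerprint() {}

  // fields maps field name -> type name; pass an insertion-ordered map
  // (e.g. LinkedHashMap) so that field order is part of the fingerprint.
  static UUID uuidFor(Map<String, String> fields) {
    StringBuilder canonical = new StringBuilder();
    for (Map.Entry<String, String> field : fields.entrySet()) {
      // Length-prefix each part so ("ab", "c") and ("a", "bc") serialize differently.
      append(canonical, field.getKey());
      append(canonical, field.getValue());
    }
    // Name-based UUID: the same canonical bytes always produce the same UUID.
    return UUID.nameUUIDFromBytes(canonical.toString().getBytes(StandardCharsets.UTF_8));
  }

  private static void append(StringBuilder sb, String part) {
    sb.append(part.length()).append(':').append(part);
  }
}
```

With this approach, two schemas that declare the same fields in the same order get the same UUID across job submissions, while a schema with reordered fields gets a different one. Whether reordered schemas should instead be treated as equivalent is the separate encoding_position question raised in the description.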



--
This message was sent by Atlassian Jira
(v8.3.4#803005)