Posted to dev@avro.apache.org by "Andy Coates (JIRA)" <ji...@apache.org> on 2017/09/26 16:00:00 UTC

[jira] [Commented] (AVRO-1721) Should LogicalTypes introduce schema (in)compatibility and canonical parsing form changes?

    [ https://issues.apache.org/jira/browse/AVRO-1721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16180991#comment-16180991 ] 

Andy Coates commented on AVRO-1721:
-----------------------------------

Hey Ryan, 

I'm not sure we can ignore logical types such as decimal, which carry scale and precision information. If that information differs between the write schema and the read schema, the in-memory representation is just plain wrong. For example, if you serialize new BigDecimal("1.2345") with scale=4 and then deserialize with a read schema that has a scale of 3, the result is new BigDecimal("12.345"), i.e. the wrong value!
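
A minimal sketch of that failure, assuming the decimal is written as the two's-complement bytes of its unscaled value (as the Avro spec describes for the decimal logical type) with the scale held only in the schema. The encode and decode helpers are hypothetical stand-ins for illustration, not Avro's own conversion classes:

```java
import java.math.BigDecimal;
import java.math.BigInteger;

class DecimalScaleMismatch {
    // Hypothetical writer: only the unscaled value reaches the wire;
    // the scale comes from the write schema and is not serialized.
    static byte[] encode(BigDecimal value, int writerScale) {
        return value.setScale(writerScale).unscaledValue().toByteArray();
    }

    // Hypothetical reader: reattaches whatever scale the read schema declares.
    static BigDecimal decode(byte[] bytes, int readerScale) {
        return new BigDecimal(new BigInteger(bytes), readerScale);
    }

    public static void main(String[] args) {
        byte[] wire = encode(new BigDecimal("1.2345"), 4); // unscaled value 12345
        System.out.println(decode(wire, 4)); // same scale on both sides: 1.2345
        System.out.println(decode(wire, 3)); // read schema says scale 3: 12.345
    }
}
```

The bytes are identical in both cases; only the scale declared by the read schema decides which value comes back.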

So, while I'm not sure at this time what the right approach is, I don't think we can leave logical types that store additional state out of compatibility checks and the canonical form.

> Should LogicalTypes introduce schema (in)compatibility and canonical parsing form changes?
> ------------------------------------------------------------------------------------------
>
>                 Key: AVRO-1721
>                 URL: https://issues.apache.org/jira/browse/AVRO-1721
>             Project: Avro
>          Issue Type: Improvement
>          Components: spec
>    Affects Versions: 1.8.0
>            Reporter: Bob Cotton
>
> During a recent spike of integrating LogicalTypes into our Avro
> wrapper we encountered the following questions.
> 1. Is the addition/removal of a logical type on a schema element a
> backward breaking change?
> 2. Should the canonical parsing form include logical type information?
> I understand that the underlying base Avro types are not changing with
> the introduction of LogicalTypes. The raw serialized data will be the
> same. However the client code dependent on the deserialization may be
> subject to breakage.
> Let me elaborate on these.
> 1. Is the addition/removal of a logical type on a schema element a
> backward breaking change?
> Take for example the UUID logical type. At least in the case of
> GenericData, if I change a schema element from a string to a UUID and
> I have Converters turned on, existing client code that is expecting a
> String to be returned will now have a runtime exception when an
> instance of UUID is suddenly returned.
> From the client's perspective I've just changed the underlying type of
> the element.
> 2. Should the canonical parsing form (CPF) include logical type information?
> If the answer to #1 is yes, then the CPF should also include the
> logical type information.
> We were wondering if there might be a slightly less strict form of
> schema "normalization" and fingerprinting. Currently the
> fingerprinting process is based on the CPF. It would be interesting to
> introduce the "Normal Parsing Form" (NPF) which retains all the
> optional information contained within a schema, but in a normal or
> regular way. That way a fingerprint could be determined without having
> to strip possibly important information, like the LogicalType info.
> Interested in your thoughts on these questions.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)