Posted to dev@avro.apache.org by "Erik Erlandson (JIRA)" <ji...@apache.org> on 2019/07/15 23:21:00 UTC

[jira] [Commented] (AVRO-2474) Support a "unit" property of schema fields

    [ https://issues.apache.org/jira/browse/AVRO-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16885691#comment-16885691 ] 

Erik Erlandson commented on AVRO-2474:
--------------------------------------

(copied from email thread)

 
Regarding schema fingerprints, my proposal is that units be fingerprinted based on their canonical form, as [defined here|http://erikerlandson.github.io/blog/2019/05/03/algorithmic-unit-analysis/]. Any two unit expressions with the same canonical form (including the corresponding coefficients) are exactly equivalent, so their fingerprints can be the same. The unit could possibly be stored on the schema in canonical form by convention, but canonical forms are often less intuitive to humans, so the documentation value of the unit for anyone reading the schema might be reduced.
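To make the fingerprint idea concrete, here is a minimal Python sketch. It is not the algorithm from the linked post and not any Avro or coulomb API: the unit table, the list-of-pairs input format, and the names canonical() and unit_fingerprint() are all assumptions made for illustration, and SHA-256 is used only as a convenient stand-in hash. A unit expression is reduced to a coefficient plus sorted base-unit exponents, and the fingerprint is taken over that reduced form.

```python
from fractions import Fraction
import hashlib

# Hypothetical unit table (an assumption for this sketch):
# unit name -> (coefficient, {base unit: exponent}).
UNITS = {
    "second": (Fraction(1), {"second": 1}),
    "millisecond": (Fraction(1, 1000), {"second": 1}),
    "byte": (Fraction(1), {"byte": 1}),
    "kilobyte": (Fraction(1000), {"byte": 1}),
    "megabyte": (Fraction(10**6), {"byte": 1}),
}


def canonical(expr):
    """Reduce a unit expression to (coefficient, sorted base-unit exponents).

    expr is a list of (unit name, integer exponent) pairs, e.g.
    [("megabyte", 1), ("second", -1)] for megabytes per second.
    """
    coef = Fraction(1)
    dims = {}
    for name, exp in expr:
        unit_coef, base = UNITS[name]
        coef *= unit_coef ** exp
        for base_unit, base_exp in base.items():
            dims[base_unit] = dims.get(base_unit, 0) + base_exp * exp
    # Drop base units whose exponents cancelled, and sort for a stable order.
    reduced = tuple(sorted((b, e) for b, e in dims.items() if e != 0))
    return coef, reduced


def unit_fingerprint(expr):
    """Hash the canonical form, so equivalent expressions share a fingerprint."""
    coef, dims = canonical(expr)
    text = f"{coef.numerator}/{coef.denominator}|"
    text += "*".join(f"{b}^{e}" for b, e in dims)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()
```

Under these assumptions, kilobyte/millisecond and megabyte/second reduce to the same canonical form and therefore the same fingerprint, while second and millisecond differ because their coefficients differ.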
 
For schema evolution, a unit change in which the previous and new units are convertible (also defined at the above link) would be well defined, and the automatic transformation would simply be the correct unit conversion (e.g. seconds to milliseconds). If the unit changes to a non-convertible unit (e.g. seconds to bytes), then no automatic transformation exists, and attempting to resolve the old and new schemas would be an error. Note that establishing the conversion assumes that both the original and new schemas are available at read time.
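The resolution rule can be sketched in Python as follows. This is an illustration under stated assumptions, not an existing Avro resolver: the CANONICAL table, UnitResolutionError, conversion_factor() and resolve_value() are hypothetical names. Two units are treated as convertible when their base-unit exponents match, and the conversion factor is the ratio of their coefficients.

```python
from fractions import Fraction

# Hypothetical canonical forms (an assumption for this sketch):
# unit name -> (coefficient, base-unit exponents).
CANONICAL = {
    "second": (Fraction(1), (("second", 1),)),
    "millisecond": (Fraction(1, 1000), (("second", 1),)),
    "byte": (Fraction(1), (("byte", 1),)),
}


class UnitResolutionError(Exception):
    """Raised when the writer and reader units are not convertible."""


def conversion_factor(writer_unit, reader_unit):
    """Factor that maps a value in writer_unit to the same quantity in reader_unit."""
    w_coef, w_dims = CANONICAL[writer_unit]
    r_coef, r_dims = CANONICAL[reader_unit]
    if w_dims != r_dims:
        # e.g. seconds to bytes: no automatic transformation exists.
        raise UnitResolutionError(
            f"cannot convert {writer_unit!r} to {reader_unit!r}"
        )
    return w_coef / r_coef


def resolve_value(value, writer_unit, reader_unit):
    """Convert a value written under the old schema to the reader's unit."""
    return value * conversion_factor(writer_unit, reader_unit)
```

With this table, resolve_value(3, "second", "millisecond") multiplies by 1000, while conversion_factor("second", "byte") raises UnitResolutionError. Both the writer and reader units must be known, which mirrors the requirement that both schemas be available at read time.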
 

> Support a "unit" property of schema fields
> ------------------------------------------
>
>                 Key: AVRO-2474
>                 URL: https://issues.apache.org/jira/browse/AVRO-2474
>             Project: Apache Avro
>          Issue Type: Improvement
>          Components: spec
>    Affects Versions: 1.9.0
>            Reporter: Erik Erlandson
>            Priority: Major
>
> Recently I have been experimenting with Avro schemas that are extended with a "unit" field. By "unit" I mean expressions like "second" or "megabyte", that is, "units of measure".
>  
> I received some community interest in making this concept "first class" for avro; I'm filing this JIRA to track the idea. 
>  
> I delivered a short talk on my experiments at Berlin Buzzwords, which can be viewed here:
> [https://www.youtube.com/watch?v=qrQmB2KFKE8]
>  
> I also wrote a short blog post that may be faster to ingest:
> [http://erikerlandson.github.io/blog/2019/05/23/unit-types-for-avro-schema-integrating-avro-with-coulomb/]
>  
> The project itself is here:
> [https://github.com/erikerlandson/coulomb]
>  
>  
>  


