Posted to dev@avro.apache.org by "Ryan Blue (JIRA)" <ji...@apache.org> on 2016/09/12 17:14:22 UTC

[jira] [Updated] (AVRO-1843) Clarify importance of writer's schema in documentation

     [ https://issues.apache.org/jira/browse/AVRO-1843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan Blue updated AVRO-1843:
----------------------------
    Fix Version/s:     (was: 1.8.2)
                       (was: 1.7.8)

> Clarify importance of writer's schema in documentation
> ------------------------------------------------------
>
>                 Key: AVRO-1843
>                 URL: https://issues.apache.org/jira/browse/AVRO-1843
>             Project: Avro
>          Issue Type: Improvement
>          Components: doc
>            Reporter: Shannon Carey
>            Priority: Critical
>             Fix For: 1.9.0
>
>
> I'll be submitting a PR with improvements to the Java Getting Started page and the Specification. The changes make it clearer that Avro must read all data with the writer's schema before converting it into the reader's schema, explain why, and explain that this is the reason the schema should be available next to serialized data. Currently, it is arguably too easy to misinterpret Avro as requiring only a single reader's schema in order to read data, while still following the resolution rules which make Avro seem similar to JSON (resolution by field name). For example, the Java API examples appear to involve only one schema, hiding the fact that the writer's schema is read in implicitly. Also, the ability to serialize to JSON (where field names and some type info are present) makes this misconception easy to believe.
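The misconception described above can be illustrated with a small sketch. This is a toy model in plain Python, not the Avro library: the helper names (write_long, read_long, encode, decode, resolve) and the list-of-pairs schema representation are made up for this example, and only the "long" and "string" types are handled. It relies on one fact about Avro's binary encoding: a record is written as its field values in schema order, with no field names or type tags, so the bytes can only be interpreted correctly with the schema that wrote them.

```python
# Toy model of Avro-style binary encoding, for illustration only.
# A "schema" here is a list of (field_name, type) pairs (hypothetical representation).

def write_long(n):
    """Zig-zag varint encoding, the scheme Avro uses for int and long."""
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while z > 0x7F:
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)

def read_long(buf, pos):
    """Decode one zig-zag varint starting at pos; return (value, new_pos)."""
    z, shift = 0, 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated data")
        b = buf[pos]
        pos += 1
        z |= (b & 0x7F) << shift
        shift += 7
        if not b & 0x80:
            return (z >> 1) ^ -(z & 1), pos

def encode(schema, record):
    """Write field values in schema order; no names or types are stored."""
    out = bytearray()
    for name, ftype in schema:
        value = record[name]
        if ftype == "long":
            out += write_long(value)
        elif ftype == "string":
            raw = value.encode("utf-8")
            out += write_long(len(raw)) + raw
        else:
            raise ValueError(f"unsupported type {ftype!r}")
    return bytes(out)

def decode(schema, buf):
    """Read field values in schema order, trusting the schema completely."""
    pos, record = 0, {}
    for name, ftype in schema:
        if ftype == "long":
            record[name], pos = read_long(buf, pos)
        elif ftype == "string":
            n, pos = read_long(buf, pos)
            if n < 0 or pos + n > len(buf):
                raise ValueError("truncated data")
            record[name] = buf[pos:pos + n].decode("utf-8")
            pos += n
        else:
            raise ValueError(f"unsupported type {ftype!r}")
    return record

def resolve(writer_schema, reader_schema, buf, defaults=None):
    """Decode with the writer's schema, then map fields by name to the reader's."""
    defaults = defaults or {}
    written = decode(writer_schema, buf)
    result = {}
    for name, _ in reader_schema:
        if name in written:
            result[name] = written[name]
        elif name in defaults:
            result[name] = defaults[name]
        else:
            raise ValueError(f"no value or default for field {name!r}")
    return result

WRITER = [("width", "long"), ("height", "long")]
READER = [("height", "long"), ("width", "long")]  # same fields, different order

payload = encode(WRITER, {"width": 3, "height": 4})
naive = decode(READER, payload)              # reader's schema only: silently wrong
resolved = resolve(WRITER, READER, payload)  # writer's schema first: correct
print(naive)     # {'height': 3, 'width': 4}
print(resolved)  # {'height': 4, 'width': 3}
```

Decoding with only the reader's schema raises no error here; it simply assigns the values to the wrong fields. Real Avro schema resolution covers more than this sketch (type promotion, unions, aliases, and so on), but the ordering of the two steps is the point being made in the issue.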



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)