Posted to dev@avro.apache.org by "Ryan Skraba (Jira)" <ji...@apache.org> on 2021/09/30 07:57:00 UTC

[jira] [Commented] (AVRO-2899) JsonEncoder writes type information for not-null union

    [ https://issues.apache.org/jira/browse/AVRO-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422590#comment-17422590 ] 

Ryan Skraba commented on AVRO-2899:
-----------------------------------

I linked the equivalent Jira issue about reading the union without the extra indirection of the type!  This is a very frequent question, and it is not well documented.

Changing the JSON encoding is a specification change that would need some extra thought.  I'm definitely not against making a change that makes the JSON format "less surprising", but we should prioritize users that depend on interoperability between languages and versions.

In practice, I try to avoid the JSON representation of Avro data and this is one of the reasons why.
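
To make the reading side concrete, here is a minimal sketch of a lenient reader for a ["null","string"] field that accepts both the wrapped and the bare form. It is not taken from Avro or from the linked issue: the class LenientUnionReader and the method readNullableString are hypothetical names, and the input is assumed to be an already-parsed JSON value (null, a String, or a Map).

```java
import java.util.Collections;
import java.util.Map;

public class LenientUnionReader {

    /**
     * Hypothetical helper, not part of Avro: reads a value for a
     * ["null","string"] field from an already-parsed JSON node.
     * Accepts the wrapped form {"string": "..."} and the bare form "...".
     */
    static String readNullableString(Object node) {
        if (node == null) {
            return null; // JSON null selects the "null" branch
        }
        if (node instanceof String) {
            return (String) node; // bare form, no type wrapper
        }
        if (node instanceof Map) {
            Map<?, ?> wrapper = (Map<?, ?>) node;
            Object inner = wrapper.get("string");
            // wrapped form: exactly one key naming the branch type
            if (wrapper.size() == 1 && inner instanceof String) {
                return (String) inner;
            }
        }
        throw new IllegalArgumentException("Not a valid [\"null\",\"string\"] value: " + node);
    }

    public static void main(String[] args) {
        System.out.println(readNullableString(Collections.singletonMap("string", "Hello World")));
        System.out.println(readNullableString("Hello World"));
        System.out.println(readNullableString(null));
    }
}
```

Accepting both forms on read is only unambiguous here because the union has a single non-null branch; with more branches the bare form may not identify the type.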

> JsonEncoder writes type information for not-null union
> ------------------------------------------------------
>
>                 Key: AVRO-2899
>                 URL: https://issues.apache.org/jira/browse/AVRO-2899
>             Project: Apache Avro
>          Issue Type: Improvement
>          Components: java
>    Affects Versions: 1.9.2
>            Reporter: Werner Daehn
>            Priority: Major
>
> _Summary: A union of [null,string] should output the value as such when using the JsonEncoder. To accomplish that, a single line needs to be changed in JsonEncoder.java. I don't believe there are side effects, but I am not sure and am looking for validation._
>  
> When the schema looks like 
> {{name: "text", type: ["null",\{"type":"string"}] }}
> the JsonEncoder creates a Json object explicitly stating the type. So the created json is
> {{text: \{ "string": "Hello World" }}}
> instead of
> {{text: "Hello World"}}
> I have searched for this issue; people complain about it frequently, but there is no real resolution. 
> I understand why this is done: the JsonDecoder needs to know which type to read in case of a union. It does not make sense, however, for a not-null union, where there is either a string value or none.
>  
> I would argue that in the simple case, where the union has two elements and the first is the NULL symbol, the extra text can be omitted. Hence I have changed the writeIndex line in the JsonEncoder from
> [https://github.com/apache/avro/blob/c903aa6d6fc42d3c347f95d469a8364ea44165e8/lang/java/avro/src/main/java/org/apache/avro/io/JsonEncoder.java#L297]
> {{if (symbol != Symbol.NULL) {}}
> to
> {{if (symbol != Symbol.NULL && (top.symbols.length > 2 || top.getSymbol(0) != Symbol.NULL)) {}}
>  
> It should not have a side effect on the JsonDecoder either, because in the case of schema evolution (making a field nullable) this must be resolved anyhow. I am not sure about that, however.
>  
>  
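
The effect of the condition proposed in the quoted description can be sketched without Avro itself. The following is a simplified model, not the actual JsonEncoder: the class UnionJsonSketch and its methods are hypothetical, union branches are represented by their type names, only string values are handled, and no JSON escaping is done.

```java
import java.util.Arrays;
import java.util.List;

public class UnionJsonSketch {

    /** Models the existing check: every non-null branch gets a type wrapper. */
    static boolean wrapCurrent(String selected) {
        return !selected.equals("null");
    }

    /**
     * Models the proposed check: skip the wrapper when the union has
     * at most two branches and the first one is "null".
     */
    static boolean wrapProposed(List<String> branches, String selected) {
        return !selected.equals("null")
                && (branches.size() > 2 || !branches.get(0).equals("null"));
    }

    /** Renders one string-valued field (simplification: value is not escaped). */
    static String encodeField(String field, List<String> branches,
                              String selected, String value, boolean proposed) {
        String json = selected.equals("null") ? "null" : "\"" + value + "\"";
        boolean wrap = proposed ? wrapProposed(branches, selected) : wrapCurrent(selected);
        if (wrap) {
            json = "{\"" + selected + "\":" + json + "}";
        }
        return "{\"" + field + "\":" + json + "}";
    }

    public static void main(String[] args) {
        List<String> nullable = Arrays.asList("null", "string");
        List<String> wider = Arrays.asList("null", "string", "bytes");

        // prints {"text":{"string":"Hello World"}}
        System.out.println(encodeField("text", nullable, "string", "Hello World", false));
        // prints {"text":"Hello World"}
        System.out.println(encodeField("text", nullable, "string", "Hello World", true));
        // prints {"text":{"string":"Hello World"}} (more than two branches keep the wrapper)
        System.out.println(encodeField("text", wider, "string", "Hello World", true));
    }
}
```

Under this model the output changes only for unions of the form ["null", X]; a union such as ["string","null"] or one with three or more branches keeps the type wrapper.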


