Posted to dev@avro.apache.org by "Nikita Ryanov (Jira)" <ji...@apache.org> on 2019/09/08 10:37:00 UTC

[jira] [Updated] (AVRO-2539) ThriftData produces not compatible avro schemas

     [ https://issues.apache.org/jira/browse/AVRO-2539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nikita Ryanov updated AVRO-2539:
--------------------------------
    Description: 
Currently, the ThriftData class produces Avro schemas that are not compatible under the AvroCompatibility rules.

For example, consider these Thrift structs:
{code:java}
struct V1 {
  1: required string f1,
  2: optional string f2
}

struct V2 {
  1: required string f1,
  2: optional string f2,
  3: optional string f3
}{code}

The produced schemas will be:

{noformat}
{"type":"record","name":"V1","namespace":"serialization.thrift.test","fields":[{"name":"f1","type":["null",{"type":"string","avro.java.string":"String"}]},{"name":"f2","type":["null",{"type":"string","avro.java.string":"String"}],"default":null}]}

{"type":"record","name":"V2","namespace":"serialization.thrift.test","fields":[{"name":"f1","type":["null",{"type":"string","avro.java.string":"String"}]},{"name":"f2","type":["null",{"type":"string","avro.java.string":"String"}]}, {"name":"f3","type":["null",{"type":"string","avro.java.string":"String"}]}]}
{noformat}

The problem is that if I check these schemas with a BACKWARD compatibility checker, I get false, because fields f2 and f3 have no default values even though they are optional.
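
To illustrate the rule behind that result, here is a minimal sketch in plain Java. It is not the Avro API (in the Avro Java library, reader/writer checks are performed by the SchemaCompatibility class). It models only one resolution rule: a field that the reader schema has and the writer schema lacks must declare a default. The class name, the method names, and the map-based schema model are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model of a single schema-resolution rule; this is NOT the Avro API.
final class BackwardCheckSketch {

    // Hypothetical model: field name -> whether the field declares a default.
    static Map<String, Boolean> v1() {
        Map<String, Boolean> fields = new LinkedHashMap<>();
        fields.put("f1", false);
        fields.put("f2", true); // the V1 schema above carries "default":null on f2
        return fields;
    }

    static Map<String, Boolean> v2(boolean f3HasDefault) {
        Map<String, Boolean> fields = new LinkedHashMap<>();
        fields.put("f1", false);
        fields.put("f2", false);
        fields.put("f3", f3HasDefault);
        return fields;
    }

    // Returns the reader fields that are absent from the writer and have no default.
    static List<String> missingDefaults(Map<String, Boolean> reader, Map<String, Boolean> writer) {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, Boolean> field : reader.entrySet()) {
            boolean writerHasField = writer.containsKey(field.getKey());
            boolean readerHasDefault = field.getValue();
            if (!writerHasField && !readerHasDefault) {
                problems.add(field.getKey());
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // BACKWARD: the new schema (V2) reads data written with the old one (V1).
        System.out.println(missingDefaults(v2(false), v1())); // prints [f3]
        System.out.println(missingDefaults(v2(true), v1()));  // prints []
    }
}
```

Under this simplified rule, giving the newly added optional field a default is enough to make V2 able to read V1 data. Other directions of the check (for example, V1 reading V2 data) would apply the same rule with the roles swapped.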

Also, if I use a default value in my Thrift definition, the resulting Avro schema will not contain it.

It is possible to add null defaults for optional fields using NULL_DEFAULT_VALUE, but that ignores the real default values.
-To get the real default values specified in *.thrift, we could use an instance of the Thrift message, but this would require refactoring methods such as getSchema and nullable in ThriftData.class-
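
The trade-off between the two options can be sketched as a small decision function. This is a hypothetical helper in plain Java, not code from ThriftData: the class, enum, and method names are invented for illustration, it handles string defaults only, and it does no JSON escaping. A real non-null default on a ["null", ...] union may additionally require reordering the union's branches, which this sketch does not model.

```java
// Hypothetical helper; none of these names come from ThriftData.
final class ThriftDefaultSketch {

    enum Requirement { REQUIRED, OPTIONAL }

    // Chooses the "default" attribute for a string field:
    //  - a default declared in the *.thrift file wins,
    //  - otherwise optional fields fall back to null,
    //  - otherwise no default attribute is emitted.
    // declaredDefault is null when the Thrift definition has no default.
    static String defaultAttribute(Requirement requirement, String declaredDefault) {
        if (declaredDefault != null) {
            return "\"default\":\"" + declaredDefault + "\""; // no JSON escaping in this sketch
        }
        if (requirement == Requirement.OPTIONAL) {
            return "\"default\":null";
        }
        return "";
    }

    public static void main(String[] args) {
        System.out.println(defaultAttribute(Requirement.OPTIONAL, null));  // prints "default":null
        System.out.println(defaultAttribute(Requirement.OPTIONAL, "abc")); // prints "default":"abc"
        System.out.println(defaultAttribute(Requirement.REQUIRED, null).isEmpty()); // prints true
    }
}
```

Applying only the null fallback corresponds to the NULL_DEFAULT_VALUE approach; the first branch is the part that would need the declared default to be available when the schema is built.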

  was:
Currently, the ThriftData class produces Avro schemas that are not compatible under the AvroCompatibility rules.

For example, consider these Thrift structs:
{code:java}
struct V1 {
  1: required string f1,
  2: optional string f2
}

struct V2 {
  1: required string f1,
  2: optional string f2,
  3: optional string f3
}{code}

The produced schemas will be:

{noformat}
{"type":"record","name":"V1","namespace":"serialization.thrift.test","fields":[{"name":"f1","type":["null",{"type":"string","avro.java.string":"String"}]},{"name":"f2","type":["null",{"type":"string","avro.java.string":"String"}],"default":null}]}

{"type":"record","name":"V2","namespace":"serialization.thrift.test","fields":[{"name":"f1","type":["null",{"type":"string","avro.java.string":"String"}]},{"name":"f2","type":["null",{"type":"string","avro.java.string":"String"}]}, {"name":"f3","type":["null",{"type":"string","avro.java.string":"String"}]}]}
{noformat}

The problem is that if I check these schemas with a BACKWARD compatibility checker, I get false, because fields f2 and f3 have no default values even though they are optional.

Also, if I use a default value in my Thrift definition, the resulting Avro schema will not contain it.

It is possible to add null defaults for optional fields using NULL_DEFAULT_VALUE, but that ignores the real default values. To honour the real default values specified in *.thrift, we could use an instance of the Thrift message to obtain them, but this would require refactoring methods such as getSchema and nullable in ThriftData.class



> ThriftData produces not compatible avro schemas
> -----------------------------------------------
>
>                 Key: AVRO-2539
>                 URL: https://issues.apache.org/jira/browse/AVRO-2539
>             Project: Apache Avro
>          Issue Type: Improvement
>          Components: java
>    Affects Versions: 1.9.0
>            Reporter: Nikita Ryanov
>            Priority: Major
>
> Currently, the ThriftData class produces Avro schemas that are not compatible under the AvroCompatibility rules.
> For example, consider these Thrift structs:
> {code:java}
> struct V1 {
>   1: required string f1,
>   2: optional string f2
> }
> struct V2 {
>   1: required string f1,
>   2: optional string f2,
>   3: optional string f3
> }{code}
> The produced schemas will be:
> {noformat}
> {"type":"record","name":"V1","namespace":"serialization.thrift.test","fields":[{"name":"f1","type":["null",{"type":"string","avro.java.string":"String"}]},{"name":"f2","type":["null",{"type":"string","avro.java.string":"String"}],"default":null}]}
> {"type":"record","name":"V2","namespace":"serialization.thrift.test","fields":[{"name":"f1","type":["null",{"type":"string","avro.java.string":"String"}]},{"name":"f2","type":["null",{"type":"string","avro.java.string":"String"}]}, {"name":"f3","type":["null",{"type":"string","avro.java.string":"String"}]}]}
> {noformat}
> The problem is that if I check these schemas with a BACKWARD compatibility checker, I get false, because fields f2 and f3 have no default values even though they are optional.
> Also, if I use a default value in my Thrift definition, the resulting Avro schema will not contain it.
> It is possible to add null defaults for optional fields using NULL_DEFAULT_VALUE, but that ignores the real default values.
> -To get the real default values specified in *.thrift, we could use an instance of the Thrift message, but this would require refactoring methods such as getSchema and nullable in ThriftData.class-



--
This message was sent by Atlassian Jira
(v8.3.2#803003)