Posted to dev@avro.apache.org by "Yubao Liu (Jira)" <ji...@apache.org> on 2020/03/06 07:40:00 UTC

[jira] [Updated] (AVRO-2772) wrong union schema forward compatibility

     [ https://issues.apache.org/jira/browse/AVRO-2772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yubao Liu updated AVRO-2772:
----------------------------
    Description: 
{code:java}
protocol Test {
    record A {
        int amount;
    }

    record B {
        int amount;
    }

    record C {
       // old version of field c:
       //   union { A } c;
       // new version of field c:
       union { A, B } c;
    }
}
{code}

The old C schema declares "union { A } c;" and the new C schema declares "union { A, B } c;". If we write a B object for field "c" with the new C schema and then read it back with the old C schema, Avro happily returns an "A" object for field "c". This is surprising.

The new and old schemas are mutually compatible according to the Avro schema validator.

The attached Maven project demonstrates the issue. Here is its output:
{code}
readerSchema and writerSchema are mutual compatible
writerSchema unionSchemas.size()=2
writerSchema unionSchemas.get(1)={"type":"record","name":"B","fields":[{"name":"amount","type":"int"}]}
r={"c": {"amount": 12}}
r.c.schema={"type":"record","name":"B","fields":[{"name":"amount","type":"int"}]}
r2={"c": {"amount": 12}}
r2.c.schema={"type":"record","name":"A","fields":[{"name":"amount","type":"int"}]}
{code}
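
For contrast, here is a minimal sketch of the outcome one might expect instead. It assumes the Avro specification's schema resolution rules: when both schemas are unions, the first reader branch that matches the writer's selected branch is used, and two record schemas match only if they have the same unqualified name. This is a toy model, not Avro library code; `UnionResolutionSketch`, `RecordSchema` and `firstMatch` are hypothetical names introduced only for illustration.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model of the specification's union resolution rule; not Avro library code. */
public class UnionResolutionSketch {

    /** Hypothetical stand-in for a named record schema; only the name matters here. */
    static final class RecordSchema {
        final String name;

        RecordSchema(String name) {
            this.name = name;
        }
    }

    /**
     * Returns the index of the first reader branch that matches the writer's
     * selected branch, or -1 if no branch matches (where the specification
     * says an error is signalled). Records are compared by name only.
     */
    static int firstMatch(List<RecordSchema> readerUnion, RecordSchema writerBranch) {
        for (int i = 0; i < readerUnion.size(); i++) {
            if (readerUnion.get(i).name.equals(writerBranch.name)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        RecordSchema a = new RecordSchema("A");
        RecordSchema b = new RecordSchema("B");
        List<RecordSchema> oldUnion = Arrays.asList(a);     // union { A }
        List<RecordSchema> newUnion = Arrays.asList(a, b);  // union { A, B }

        // Data written as B, read with the old union.
        System.out.println("old reader, B written: " + firstMatch(oldUnion, b));
        // Data written as A, read with the new union.
        System.out.println("new reader, A written: " + firstMatch(newUnion, a));
    }
}
```

Under this name-only rule, a B value read with the old union finds no matching branch (index -1), so a read error would be expected rather than an A object. An A value read with the new union resolves to branch 0.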









> wrong union schema forward compatibility 
> -----------------------------------------
>
>                 Key: AVRO-2772
>                 URL: https://issues.apache.org/jira/browse/AVRO-2772
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.9.2
>         Environment: JDK 13,  Maven 3.6.3,  avro 1.9.2.
>            Reporter: Yubao Liu
>            Priority: Major
>         Attachments: avro-union.tar.gz
>
>


