Posted to dev@avro.apache.org by "Yubao Liu (Jira)" <ji...@apache.org> on 2020/03/06 07:40:00 UTC
[jira] [Updated] (AVRO-2772) wrong union schema forward compatibility
[ https://issues.apache.org/jira/browse/AVRO-2772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yubao Liu updated AVRO-2772:
----------------------------
Description:
{code:java}
protocol Test {
  record A {
    int amount;
  }
  record B {
    int amount;
  }
  record C {
    // old:
    union { A } c;
    // new:
    union { A, B } c;
  }
}
{code}
The old C schema declares "union {A} c;" and the new C schema declares "union {A, B} c;". If we write a B object into field "c" with the new C schema and then read it back with the old C schema, Avro silently returns an "A" object for field "c", which is surprising.
The new and old schemas are mutually compatible according to the Avro schema validator.
I have attached a Maven project that demonstrates the issue. Here is its output:
{code}
readerSchema and writerSchema are mutual compatible
writerSchema unionSchemas.size()=2
writerSchema unionSchemas.get(1)={"type":"record","name":"B","fields":[{"name":"amount","type":"int"}]}
r={"c": {"amount": 12}}
r.c.schema={"type":"record","name":"B","fields":[{"name":"amount","type":"int"}]}
r2={"c": {"amount": 12}}
r2.c.schema={"type":"record","name":"A","fields":[{"name":"amount","type":"int"}]}
{code}
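The result above can be reasoned about with a small, library-free sketch. It is not Avro code: RecordSchema, resolveByName and resolveByNameThenStructure are hypothetical names invented for illustration. The lenient rule (match by name, otherwise fall back to the first branch with the same fields) is only an assumption about how a reader could end up choosing "A" for a written "B"; the strict rule matches records by name only.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class UnionResolutionSketch {

    /** Hypothetical stand-in for a record schema: a name plus field name -> type. */
    static final class RecordSchema {
        final String name;
        final Map<String, String> fields;

        RecordSchema(String name, Map<String, String> fields) {
            this.name = name;
            this.fields = fields;
        }
    }

    /** Builds a record with the single field "int amount", like A and B in the report. */
    static RecordSchema rec(String name) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("amount", "int");
        return new RecordSchema(name, fields);
    }

    /** Strict rule: a reader branch matches only when the record names are equal. */
    static int resolveByName(List<RecordSchema> readerUnion, RecordSchema written) {
        for (int i = 0; i < readerUnion.size(); i++) {
            if (readerUnion.get(i).name.equals(written.name)) {
                return i;
            }
        }
        return -1; // no matching branch
    }

    /** Lenient rule (an assumption): name match first, else first branch with equal fields. */
    static int resolveByNameThenStructure(List<RecordSchema> readerUnion, RecordSchema written) {
        int byName = resolveByName(readerUnion, written);
        if (byName >= 0) {
            return byName;
        }
        for (int i = 0; i < readerUnion.size(); i++) {
            if (readerUnion.get(i).fields.equals(written.fields)) {
                return i;
            }
        }
        return -1; // no matching branch
    }

    public static void main(String[] args) {
        List<RecordSchema> oldUnion = List.of(rec("A")); // reader side: union { A }
        RecordSchema written = rec("B");                  // writer side wrote a B

        System.out.println("lenient branch index = " + resolveByNameThenStructure(oldUnion, written));
        System.out.println("strict branch index  = " + resolveByName(oldUnion, written));
    }
}
```

Under the lenient rule the written B resolves to branch 0 (record A), which mirrors r2.c.schema in the output above. Under the strict rule no branch matches (-1), so a reader would have to report an error instead of returning an A.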
> wrong union schema forward compatibility
> -----------------------------------------
>
> Key: AVRO-2772
> URL: https://issues.apache.org/jira/browse/AVRO-2772
> Project: Apache Avro
> Issue Type: Bug
> Components: java
> Affects Versions: 1.9.2
> Environment: JDK 13, Maven 3.6.3, avro 1.9.2.
> Reporter: Yubao Liu
> Priority: Major
> Attachments: avro-union.tar.gz
>
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)