Posted to dev@avro.apache.org by "Jurgis Pods (Jira)" <ji...@apache.org> on 2020/04/08 10:05:00 UTC

[jira] [Updated] (AVRO-2793) Schema compatibility should consider fullname of records

     [ https://issues.apache.org/jira/browse/AVRO-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jurgis Pods updated AVRO-2793:
------------------------------
    Description: 
Consider the following example:
{code:java}
Schema writerSchema = Schema.createRecord("fieldname", null, "namespace1", false, Collections.emptyList());
Schema readerSchema = Schema.createRecord("fieldname", null, "namespace2", false, Collections.emptyList());

// compat.getType() should be SchemaCompatibilityType.INCOMPATIBLE, but is actually SchemaCompatibilityType.COMPATIBLE
SchemaPairCompatibility compat = SchemaCompatibility.checkReaderWriterCompatibility(readerSchema, writerSchema);
{code}
I would expect the validation to yield an incompatible result, as records should have identical fullnames.

This issue is similar to AVRO-2322, but the other way around: here the namespace differs, not the record name.

The root cause seems to be in [SchemaCompatibility::schemaNameEquals|https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/SchemaCompatibility.java#L97], where getName() is used instead of getFullName().
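
To make the difference concrete, here is a minimal, self-contained sketch of the two comparison strategies. It is not Avro's actual code: NamedSchemaSketch is a hypothetical stand-in for a named schema, and nameEquals and fullNameEquals are illustrative helpers that mimic a getName()-based and a getFullName()-based check respectively.

```java
import java.util.Objects;

// Hypothetical stand-in for a named Avro schema; NOT part of the Avro API.
final class NamedSchemaSketch {
    final String name;
    final String namespace; // may be null or empty

    NamedSchemaSketch(String name, String namespace) {
        this.name = name;
        this.namespace = namespace;
    }

    // Fullname as described in this issue: namespace and name joined by a dot.
    String fullName() {
        return (namespace == null || namespace.isEmpty()) ? name : namespace + "." + name;
    }

    // Lenient comparison: ignores the namespace (the behaviour reported here).
    static boolean nameEquals(NamedSchemaSketch reader, NamedSchemaSketch writer) {
        return Objects.equals(reader.name, writer.name);
    }

    // Strict comparison: requires identical fullnames (the behaviour proposed here).
    static boolean fullNameEquals(NamedSchemaSketch reader, NamedSchemaSketch writer) {
        return Objects.equals(reader.fullName(), writer.fullName());
    }

    public static void main(String[] args) {
        NamedSchemaSketch writer = new NamedSchemaSketch("fieldname", "namespace1");
        NamedSchemaSketch reader = new NamedSchemaSketch("fieldname", "namespace2");
        System.out.println("name only: " + nameEquals(reader, writer));     // true
        System.out.println("fullname:  " + fullNameEquals(reader, writer)); // false
    }
}
```

With the two records from the example above, the name-only check reports a match while the fullname check does not, which is the mismatch this issue is about.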

Is there any reason not to be strict here and use the fullname for validation? We ran into severe problems after changing a record's namespace in a newer schema version. The Avro schema compatibility check passed, so we deployed with confidence. However, the change then caused problems for both Confluent's kafka-s3-connector and Amazon Athena when reading data produced by the new schema.

  was:
Consider the following example:
{code:java}
Schema writerSchema = Schema.createRecord("fieldname", null, "namespace1", false, Collections.emptyList());
Schema readerSchema = Schema.createRecord("fieldname", null, "namespace2", false, Collections.emptyList());

// compat.getType() should be SchemaCompatibilityType.INCOMPATIBLE, but is actually SchemaCompatibilityType.COMPATIBLE
SchemaPairCompatibility compat = SchemaCompatibility.checkReaderWriterCompatibility(readerSchema, writerSchema);
{code}
I would expect the validation to yield an incompatible result, as records should have identical fullnames.

This issue is similar to AVRO-2322, but the other way around: here the namespace differs, not the record name.

The root cause seems to be in [SchemaCompatibility::schemaNameEquals|[https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/SchemaCompatibility.java#L97]|https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/SchemaCompatibility.java#L97),], where getName() is used instead of getFullName().

Is there any reason not to be strict here and use the fullname for validation? We ran into severe problems after changing a record's namespace in a newer schema version. The Avro schema compatibility check passed, so we deployed with confidence. However, the change then caused problems for both Confluent's kafka-s3-connector and Amazon Athena when reading data produced by the new schema.


> Schema compatibility should consider fullname of records
> --------------------------------------------------------
>
>                 Key: AVRO-2793
>                 URL: https://issues.apache.org/jira/browse/AVRO-2793
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.9.2
>            Reporter: Jurgis Pods
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.3.4#803005)