Posted to user@avro.apache.org by Elliot West <te...@gmail.com> on 2017/02/17 12:27:32 UTC

Implementation of compatibility rules

Hi,

I've been attempting to understand the implementation of Avro schema
compatibility rules and am slightly confused by the structure of the code.
It seems that there are at least two possible entry points:

   - org.apache.avro.SchemaCompatibility.checkReaderWriterCompatibility(Schema, Schema)
   - org.apache.avro.SchemaValidatorBuilder

The code paths of these do not seem to intersect, with one implementing a
static set of rule checks and the other seemingly delegating to a
grammar-based approach. Does this imply that there are in fact two
implementations of the compatibility rules?

Apologies if this is a naïve question.

Thanks,

Elliot.

Re: Implementation of compatibility rules

Posted by Doug Cutting <cu...@gmail.com>.

Support for aliases should be easy to add by calling
Schema#applyAliases before the compatibility check.

Whether aliases should be applied depends on whether the compatibility
check is meant to be valid only for implementations that support
aliases or also for ones that do not.

Note that support for aliases might be implemented through a service.
A schema registry service could be extended to also apply aliases.  A
command that retrieves a writer's schema by ID could also accept the
reader's schema, and its result would be the writer's schema with the
reader's aliases applied.

Doug
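
To make the ordering "apply aliases, then check" concrete, here is a
minimal, self-contained sketch. It does not use Avro's classes: a schema
is reduced to a list of field names, defaults are ignored, and
AliasSketch, applyAliases and strictlyReadable are hypothetical names
invented for this illustration only. In Avro itself the rewriting step
would be Schema#applyAliases, as mentioned above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model only: a "schema" here is just an ordered list of field names. */
public class AliasSketch {

  /**
   * Rewrites writer field names to the reader's names wherever the reader
   * declares the writer's name as an alias. readerAliases maps each reader
   * field name to the aliases declared for it (assumed layout, not Avro's).
   */
  public static List<String> applyAliases(List<String> writerFields,
                                          Map<String, List<String>> readerAliases) {
    // Invert the map: alias -> reader field name.
    Map<String, String> aliasToName = new HashMap<>();
    for (Map.Entry<String, List<String>> e : readerAliases.entrySet()) {
      for (String alias : e.getValue()) {
        aliasToName.put(alias, e.getKey());
      }
    }
    List<String> rewritten = new ArrayList<>();
    for (String field : writerFields) {
      String mapped = aliasToName.get(field);
      rewritten.add(mapped != null ? mapped : field);
    }
    return rewritten;
  }

  /** Strict, name-only check: every reader field must exist in the writer. */
  public static boolean strictlyReadable(List<String> writerFields,
                                         Map<String, List<String>> readerAliases) {
    return writerFields.containsAll(readerAliases.keySet());
  }

  public static void main(String[] args) {
    List<String> writer = new ArrayList<>();
    writer.add("id");
    writer.add("fullname");

    // Reader has fields "id" and "name"; "name" lists "fullname" as an alias.
    Map<String, List<String>> reader = new HashMap<>();
    reader.put("id", new ArrayList<String>());
    List<String> nameAliases = new ArrayList<>();
    nameAliases.add("fullname");
    reader.put("name", nameAliases);

    // Without alias application the strict check fails ...
    System.out.println(strictlyReadable(writer, reader));                       // false
    // ... after rewriting the writer's names it passes.
    System.out.println(strictlyReadable(applyAliases(writer, reader), reader)); // true
  }
}
```

The same rewriting step is what a registry-side "fetch the writer's
schema with my aliases applied" command would perform before returning
the schema, so a strict checker on the client would not need to know
about aliases at all.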

On Wed, Feb 22, 2017 at 8:47 AM, Joseph P. <jo...@gmail.com> wrote:
> This change (considering alias in schema compatibility) is really welcomed
> and needed in our usage of it. So thanks a lot for this much needed change
> (IMHO).
>
> best,
> joseph

Re: Implementation of compatibility rules

Posted by "Joseph P." <jo...@gmail.com>.

This change (considering alias in schema compatibility) is really welcomed
and needed in our usage of it. So thanks a lot for this much needed change
(IMHO).

best,
joseph

On Wed, Feb 22, 2017 at 4:55 PM, Elliot West <te...@gmail.com> wrote:

> Update:
>
> I had a go at modifying org.apache.avro.SchemaValidatorBuilder to use
> SchemaCompatibility and then ran the schema compatibility test suites
> from both the Avro project and Confluent's Schema Registry. Every case that
> is tested appears to continue to function correctly, with one exception:
> SchemaCompatibility appears to favourably consider aliases when
> performing name-based compatibility checks, whereas the implementation
> provided via SchemaValidatorBuilder is stricter and does not.
>
> The specification <http://avro.apache.org/docs/1.8.1/spec.html#Aliases>
> makes no definitive judgement on the matter, simply stating that 'an
> implementation may optionally use aliases'. Perhaps this should be
> configurable in the aforementioned implementations so that the user can
> decide and also have a chance of obtaining consistent behaviour?
>
> Elliot.

Re: Implementation of compatibility rules

Posted by Elliot West <te...@gmail.com>.

Update:

I had a go at modifying org.apache.avro.SchemaValidatorBuilder to use
SchemaCompatibility and then ran the schema compatibility test suites
from both the Avro project and Confluent's Schema Registry. Every case that
is tested appears to continue to function correctly, with one exception:
SchemaCompatibility appears to favourably consider aliases when performing
name-based compatibility checks, whereas the implementation provided via
SchemaValidatorBuilder is stricter and does not.

The specification <http://avro.apache.org/docs/1.8.1/spec.html#Aliases>
makes no definitive judgement on the matter, simply stating that 'an
implementation may optionally use aliases'. Perhaps this should be
configurable in the aforementioned implementations so that the user can
decide and also have a chance of obtaining consistent behaviour?

Elliot.
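
As a rough illustration of what "configurable" could mean here, the
sketch below models only the name comparison, with the alias behaviour
chosen by the caller. It is a toy under stated assumptions, not Avro
code: NameMatchSketch, AliasMode and namesMatch are hypothetical names
invented for this example, and a real change would have to thread such
an option through the existing checkers.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Toy model of a name check whose alias handling is chosen by the caller. */
public class NameMatchSketch {

  /** Hypothetical switch invented for this sketch; not an Avro API. */
  public enum AliasMode { IGNORE_ALIASES, USE_READER_ALIASES }

  /**
   * Returns true when the writer's name is acceptable to the reader.
   * readerAliases are the aliases declared on the reader's side.
   */
  public static boolean namesMatch(String readerName, List<String> readerAliases,
                                   String writerName, AliasMode mode) {
    if (readerName.equals(writerName)) {
      return true; // identical names match in either mode
    }
    // Only the alias-aware mode accepts a writer name listed as a reader alias.
    return mode == AliasMode.USE_READER_ALIASES && readerAliases.contains(writerName);
  }

  public static void main(String[] args) {
    List<String> aliases = Arrays.asList("Person");
    // Strict mode: a renamed type is rejected even though an alias exists.
    System.out.println(
        namesMatch("User", aliases, "Person", AliasMode.IGNORE_ALIASES));     // false
    // Alias-aware mode: the same pair is accepted.
    System.out.println(
        namesMatch("User", aliases, "Person", AliasMode.USE_READER_ALIASES)); // true
    // Identical names are accepted regardless of mode.
    System.out.println(
        namesMatch("User", Collections.<String>emptyList(), "User",
            AliasMode.IGNORE_ALIASES));                                       // true
  }
}
```

With an explicit mode, both entry points could be asked for the same
behaviour, which is the consistency concern raised above.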

On 22 February 2017 at 13:48, Elliot West <te...@gmail.com> wrote:

> Further to this, is there any reason why, conceptually, the implementation
> of org.apache.avro.ValidateMutualRead.canRead(Schema, Schema) could not
> be changed from:
>
>   static void canRead(Schema writtenWith, Schema readUsing)
>       throws SchemaValidationException {
>     boolean error;
>     try {
>       error = Symbol.hasErrors(new ResolvingGrammarGenerator().generate(
>           writtenWith, readUsing));
>     } catch (IOException e) {
>       throw new SchemaValidationException(readUsing, writtenWith, e);
>     }
>     if (error) {
>       throw new SchemaValidationException(readUsing, writtenWith);
>     }
>   }
>
>
> to:
>
>   static void canRead(Schema writtenWith, Schema readUsing)
>       throws SchemaValidationException {
>     SchemaCompatibilityType compatibilityType = SchemaCompatibility
>         .checkReaderWriterCompatibility(readUsing, writtenWith).getType();
>     if (compatibilityType != SchemaCompatibilityType.COMPATIBLE) {
>       throw new SchemaValidationException(readUsing, writtenWith);
>     }
>   }
>
>
> Or am I missing something fundamental?
>
> Thanks,
>
> Elliot.

Re: Implementation of compatibility rules

Posted by Elliot West <te...@gmail.com>.

Further to this, is there any reason why, conceptually, the implementation
of org.apache.avro.ValidateMutualRead.canRead(Schema, Schema) could not be
changed from:

  static void canRead(Schema writtenWith, Schema readUsing)
      throws SchemaValidationException {
    boolean error;
    try {
      error = Symbol.hasErrors(new ResolvingGrammarGenerator().generate(
          writtenWith, readUsing));
    } catch (IOException e) {
      throw new SchemaValidationException(readUsing, writtenWith, e);
    }
    if (error) {
      throw new SchemaValidationException(readUsing, writtenWith);
    }
  }


to:

  static void canRead(Schema writtenWith, Schema readUsing)
      throws SchemaValidationException {
    SchemaCompatibilityType compatibilityType = SchemaCompatibility
        .checkReaderWriterCompatibility(readUsing, writtenWith).getType();
    if (compatibilityType != SchemaCompatibilityType.COMPATIBLE) {
      throw new SchemaValidationException(readUsing, writtenWith);
    }
  }


Or am I missing something fundamental?

Thanks,

Elliot.

On 17 February 2017 at 12:27, Elliot West <te...@gmail.com> wrote:

> Hi,
>
> I've been attempting to understand the implementation of Avro schema
> compatibility rules and am slightly confused by the structure of the code.
> It seems that there are at least two possible entry points:
>
>    - org.apache.avro.SchemaCompatibility.checkReaderWriterCompatibility(Schema, Schema)
>    - org.apache.avro.SchemaValidatorBuilder
>
> The code paths of these do not seem to intersect, with one implementing a
> static set of rule checks and the other seemingly delegating to a
> grammar-based approach. Does this imply that there are in fact two
> implementations of the compatibility rules?
>
> Apologies if this is a naïve question.
>
> Thanks,
>
> Elliot.
>