Posted to user@avro.apache.org by KV 59 <kv...@gmail.com> on 2020/07/07 04:36:08 UTC

SchemaValidator vs SchemaCompatibility

Which is the right class to use to check compatibility?

I'm using Avro 1.9.2 and I'm trying to check the compatibility of the
following schemas using SchemaCompatibility. I can't figure out what the
issue is or why it reports them as incompatible.

Reader Schema
-------------------------------------

  {
    "type": "record",
    "name": "WorksheetCompleted",
    "namespace": "com.school.avro",
    "doc": "Emitted After an Student completed working on worksheet",
    "fields": [
      {
        "name": "worksheet",
        "type": {
          "type": "record",
          "name": "Worksheet",
          "doc": "Completed worksheet",
          "fields": [
            {
              "name": "worksheet1",
              "type": {
                "type": "array",
                "items": {
                  "type": "record",
                  "name": "WorksheetItem",
                  "doc": "One worksheet question with an answer",
                  "fields": [
                    {
                      "name": "question_id",
                      "type": "long",
                      "doc": "Question id"
                    },
                    {
                      "name": "answer",
                      "type": [
                        "null",
                        "string"
                      ],
                      "doc": "Answer",
                      "default": null
                    }
                  ]
                }
              },
              "doc": "Collection of worksheet questions with answers"
            }
          ]
        }
      }
    ]
  }

Writer Schema
----------------------------------------------
  {
    "type": "record",
    "name": "WorksheetCompleted",
    "namespace": "com.school.avro",
    "doc": "Emitted After an Student completed working on worksheet",
    "fields": [
      {
        "name": "worksheet",
        "type": {
          "type": "record",
          "name": "Worksheet",
          "doc": "Completed worksheet",
          "fields": [
            {
              "name": "worksheet_items",
              "type": {
                "type": "array",
                "items": {
                  "type": "record",
                  "name": "WorksheetItem",
                  "doc": "One worksheet question with an answer",
                  "fields": [
                    {
                      "name": "question_id",
                      "type": "long",
                      "doc": "Question id"
                    },
                    {
                      "name": "answer",
                      "type": [
                        "null",
                        "string"
                      ],
                      "doc": "Answer",
                      "default": null
                    }
                  ]
                }
              },
              "doc": "Collection of worksheet questions with answers",
              "aliases": [
                "worksheet1"
              ]
            },
            {
              "name": "student",
              "type": [
                "null",
                "string"
              ],
              "doc": "an Student who completed the worksheet",
              "default": null
            },
            {
              "name": "duration",
              "type": [
                "null",
                "long"
              ],
              "doc": "Worksheet duration in milliseconds",
              "default": null
            }
          ]
        }
      }
    ]
  }


I get an error

INCOMPATIBLE
SchemaCompatibilityResult{compatibility:INCOMPATIBLE,
incompatibilities:[Incompatibility{type:READER_FIELD_MISSING_DEFAULT_VALUE,
location:/fields/0/type/fields/0, message:worksheet1,
reader:{"type":"record","name":"Worksheet","namespace":"com.school.avro","doc":"Completed
worksheet","fields":[{"name":"worksheet1","type":{"type":"array","items":{"type":"record","name":"WorksheetItem","doc":"One
worksheet question with an
answer","fields":[{"name":"question_id","type":"long","doc":"Question
id"},{"name":"answer","type":["null","string"],"doc":"Answer","default":null}]}},"doc":"Collection
of worksheet questions with answers"}]},
writer:{"type":"record","name":"Worksheet","namespace":"com.school.avro","doc":"Completed
worksheet","fields":[{"name":"worksheet_items","type":{"type":"array","items":{"type":"record","name":"WorksheetItem","doc":"One
worksheet question with an
answer","fields":[{"name":"question_id","type":"long","doc":"Question
id"},{"name":"answer","type":["null","string"],"doc":"Answer","default":null}]}},"doc":"Collection
of worksheet questions with
answers","aliases":["worksheet1"]},{"name":"student","type":["null","string"],"doc":"an
Student who completed the
worksheet","default":null},{"name":"duration","type":["null","long"],"doc":"Worksheet
duration in milliseconds","default":null}]}}]}


What is it that I'm doing wrong?

Also, I would like to know which class to use to check compatibility.

Thanks

Re: SchemaValidator vs SchemaCompatibility

Posted by KV 59 <kv...@gmail.com>.
Hi Elliot,

I figured as much, and I think it is a bug in the SchemaCompatibility
implementation. The Avro spec says:

> Aliases function by re-writing the writer's schema using aliases from the
> reader's schema. For example, if the writer's schema was named "Foo" and
> the reader's schema is named "Bar" and has an alias of "Foo", then the
> implementation would act as though "Foo" were named "Bar" when reading.
> Similarly, *if data was written as a record with a field named "x" and is
> read as a record with a field named "y" with alias "x", then the
> implementation would act as though "x" were named "y" when reading*.
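
To make the quoted rule concrete, here is a minimal sketch of alias rewriting, assuming a simplified model in which a record is just a list of field names. The class AliasRewriteSketch and the method rewriteWriterFields are hypothetical names for illustration and are not part of Avro.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Avro: models only field names and aliases.
final class AliasRewriteSketch {

    // Rewrites writer field names using the aliases declared on the READER's fields.
    // readerAliases maps each reader field name to the aliases declared on that field.
    static List<String> rewriteWriterFields(List<String> writerFields,
                                            Map<String, List<String>> readerAliases) {
        List<String> rewritten = new ArrayList<>();
        for (String writerName : writerFields) {
            String resolved = writerName; // unchanged unless a reader alias matches
            for (Map.Entry<String, List<String>> entry : readerAliases.entrySet()) {
                if (entry.getValue().contains(writerName)) {
                    resolved = entry.getKey(); // act as though the writer used the reader's name
                    break;
                }
            }
            rewritten.add(resolved);
        }
        return rewritten;
    }
}
```

Under this model, a writer field "x" read by a reader field "y" with alias "x" is treated as "y". Aliases declared only on the writer side never enter the lookup, so "worksheet_items" stays "worksheet_items" when the reader field "worksheet1" declares no aliases.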


The lookup in SchemaCompatibility.java
https://github.com/apache/avro/blob/2c7b9af7d5ba35afe9cf84eae3b273a6df0612b1/lang/java/avro/src/main/java/org/apache/avro/SchemaCompatibility.java#L116

looks for aliases only in the reader schema. I believe it should look the
other way as well. By not doing so, it breaks the forward compatibility of
schemas. In other words, if I have to rename a field in a newer version of
the schema, I have to modify the previous version to add an alias. (This
doesn't seem right to me.)

Thanks
Kishore


Re: SchemaValidator vs SchemaCompatibility

Posted by Elliot West <te...@gmail.com>.
The error in question is: READER_FIELD_MISSING_DEFAULT_VALUE,
location:/fields/0/type/fields/0

READER_FIELD_MISSING_DEFAULT_VALUE indicates that the reader requires a
default value on a field.
The field can be identified with the JSON pointer: /fields/0/type/fields/0

Applying the pointer to the reader schema suggests that you need to specify
a default value for field worksheet.worksheet1.
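
As an illustration of how such a pointer is applied, here is a minimal sketch that walks an already parsed JSON tree. It assumes objects are represented as java.util.Map and arrays as java.util.List, and it does not handle the "~0" and "~1" escapes of JSON Pointer. JsonPointerSketch and resolve are hypothetical names, not part of Avro.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Avro: resolves a simple JSON pointer.
final class JsonPointerSketch {

    // Walks a parsed JSON tree (Map for objects, List for arrays) along a
    // pointer such as "/fields/0/type/fields/0" and returns the node it names.
    static Object resolve(Object root, String pointer) {
        Object current = root;
        if (pointer.isEmpty()) {
            return current; // the empty pointer refers to the whole document
        }
        for (String token : pointer.substring(1).split("/")) {
            if (current instanceof Map) {
                current = ((Map<?, ?>) current).get(token); // object member lookup
            } else if (current instanceof List) {
                current = ((List<?>) current).get(Integer.parseInt(token)); // array index
            } else {
                throw new IllegalArgumentException(
                        "Cannot descend into " + current + " with token " + token);
            }
        }
        return current;
    }
}
```

Applied to the reader schema, the tokens select the first top-level field (worksheet), then its type (the Worksheet record), then that record's first field, which is worksheet1.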

At first glance, this appears correct, as field worksheet1 is not present in
the type Worksheet in the writer schema, and so the reader would need a
substitute default value when reading. However, I notice that you do have
an alias on the worksheet_items field of the writer schema, mapping to the
name worksheet1. This will not work, as I understand it, because aliases
are a property of the schema, not the data, and so the reader will be
unaware of the alias declared on the writer schema. I expect what you need
to do instead is declare an alias in the reader schema on
the worksheet.worksheet1 field:

"aliases": [
  "worksheet_items"
]
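
A rough sketch of the check as I understand it from the behaviour described in this thread, not a copy of Avro's source: each reader field is matched against writer fields by the reader's name or the reader's aliases, and an unmatched reader field without a default is reported. ReaderFieldCheckSketch, Field and missingDefaults are hypothetical names for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model, not part of Avro: a simplified missing-default check.
final class ReaderFieldCheckSketch {

    // Simplified field: a name, its declared aliases, and whether it has a default.
    static final class Field {
        final String name;
        final List<String> aliases;
        final boolean hasDefault;

        Field(String name, List<String> aliases, boolean hasDefault) {
            this.name = name;
            this.aliases = aliases;
            this.hasDefault = hasDefault;
        }
    }

    // Returns the names of reader fields that match no writer field and have no default.
    static List<String> missingDefaults(List<Field> readerFields, List<Field> writerFields) {
        List<String> problems = new ArrayList<>();
        for (Field reader : readerFields) {
            boolean matched = false;
            for (Field writer : writerFields) {
                // Only the reader's name and aliases are consulted; writer aliases are ignored.
                if (writer.name.equals(reader.name) || reader.aliases.contains(writer.name)) {
                    matched = true;
                    break;
                }
            }
            if (!matched && !reader.hasDefault) {
                problems.add(reader.name);
            }
        }
        return problems;
    }
}
```

With the Worksheet fields from this thread, the sketch reports worksheet1 as long as the alias sits only on the writer's worksheet_items field, and reports nothing once the reader's worksheet1 field declares the alias worksheet_items.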

Thanks,

Elliot.
