Posted to dev@avro.apache.org by Matthew Chng <ma...@shopback.com.INVALID> on 2022/12/07 08:29:04 UTC

Fwd: Possible bug with avro-js with binary serdes and schema resolution

Hi,

I could not reach the users mailing list, so I am trying the dev mailing
list. I can't find a contact email for Jira account creation, and I can't
join Slack because I have no ASF email account.

Issue described below.

Thanks,
Matthew

---------- Forwarded message ---------
From: Matthew Chng <ma...@shopback.com>
Date: Wed, Dec 7, 2022 at 4:18 PM
Subject: Possible bug with avro-js with binary serdes and schema resolution
To: <us...@avro.apache.org>


Hi all,
I am encountering an issue with the avro-js NPM module where Avro data
serialized into binary buffers is not readable by a different but
compatible (evolved) reader schema. The issue only occurs when using
the `toBuffer()` and `fromBuffer()` methods; everything works as expected
when using the `toString()` and `fromString()` JSON serdes methods.
The following is an example of an evolving schema, where the only difference
is the additional `gender` field, which has a default value.

const parentV1Type = avro.parse({
  name: 'Parent',
  type: 'record',
  fields: [
    { name: 'name', type: 'string' }
  ]
})

const parentV2Type = avro.parse({
  name: 'Parent',
  type: 'record',
  fields: [
    { name: 'name', type: 'string' },
    { name: 'gender', type: 'string', default: 'unspecified' }
  ]
})

According to
https://avro.apache.org/docs/1.11.1/specification/#schema-resolution
the two schemas should be both backward and forward compatible for a
reader, because the following resolution rules apply:

   - both schemas are records with the same (unqualified) name
   - if the writer’s record contains a field with a name not present in the
   reader’s record, the writer’s value for that field is ignored.
   - if the reader’s record schema has a field that contains a default
   value, and writer’s schema does not have a field with the same name, then
   the reader should use the default value from its field.
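
The two field-level rules above can be sketched in plain JavaScript, with no
avro-js dependency. The function `resolveRecord` and the field-descriptor
arrays are hypothetical names made up for this illustration and are not part
of avro-js; the sketch works on an already-decoded value and ignores type
promotion, aliases and nested records.

```javascript
// Hypothetical helper, not an avro-js API: applies the record resolution
// rules to a value that was already decoded with the writer's schema.
function resolveRecord(writerFields, readerFields, writerValue) {
  const writerNames = new Set(writerFields.map((f) => f.name))
  const result = {}
  for (const field of readerFields) {
    if (writerNames.has(field.name)) {
      // Field known to both schemas: keep the writer's value.
      result[field.name] = writerValue[field.name]
    } else if ('default' in field) {
      // Reader-only field with a default: use the default.
      result[field.name] = field.default
    } else {
      // Reader-only field without a default: the schemas do not resolve.
      throw new Error('no default for reader field: ' + field.name)
    }
  }
  // Writer-only fields are never copied, which is how they get ignored.
  return result
}

const v1Fields = [{ name: 'name', type: 'string' }]
const v2Fields = [
  { name: 'name', type: 'string' },
  { name: 'gender', type: 'string', default: 'unspecified' }
]

// Writer V1, reader V2: the default fills in the missing field.
console.log(resolveRecord(v1Fields, v2Fields, { name: 'David' }))
// Writer V2, reader V1: the extra field is dropped.
console.log(resolveRecord(v2Fields, v1Fields, { name: 'David', gender: 'Father' }))
```

Note that both rules are phrased in terms of the writer's field list, so any
implementation of them needs access to the writer's schema as well as the
reader's.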

Testing with the `toString()` and `fromString()` JSON serdes methods
confirms this. I've created a simple test script to reproduce the issue,
including a test with a nested schema. The script is included after the
output. The errors encountered are either

   - truncated buffer; or
   - trailing data

Script output:


--- JSON Writer: ParentV1, Reader: ParentV2 ---
parentV1Json:            {"name":"David"}
parentV2ReadFromV1Json:  {"name":"David","gender":"unspecified"}

--- JSON Writer: ParentV2, Reader: ParentV1 ---
parentV2Json:            {"name":"David","gender":"Father"}
parentV1ReadFromV2Json:  {"name":"David"}

--- Buffer Writer: ParentV1, Reader: ParentV1 ---
parentV1Buffer:            <Buffer 0a 44 61 76 69 64>
parentV1ReadFromV1Buffer:  {"name":"David"}

--- Buffer Writer: ParentV2, Reader: ParentV2 ---
parentV2Buffer:            <Buffer 0a 44 61 76 69 64 0c 46 61 74 68 65 72>
parentV2ReadFromV1Buffer:  {"name":"David","gender":"Father"}

--- Buffer Writer: ParentV1, Reader: ParentV2 ---
parentV1Buffer:           <Buffer 0a 44 61 76 69 64>
parentV2ReadFromV1Buffer: ERROR  truncated buffer

--- Buffer Writer: ParentV2, Reader: ParentV1 ---
parentV2Buffer:           <Buffer 0a 44 61 76 69 64 0c 46 61 74 68 65 72>
parentV1ReadFromV2Buffer: ERROR  trailing data

--- JSON Writer: meWithParentV1, Reader: meWithParentV2 ---
meWithParentV1Json:            {"name":"Davidson","parent":{"name":"David"}}
meWithParentV2ReadFromV1Json:  {"name":"Davidson","parent":{"name":"David","gender":"unspecified"}}

--- JSON Writer: meWithParentV2, Reader: meWithParentV1 ---
meWithParentV2Json:            {"name":"Davidson","parent":{"name":"David","gender":"Father"}}
meWithParentV1ReadFromV2Json:  {"name":"Davidson","parent":{"name":"David"}}

--- Buffer Writer: meWithParentV1, Reader: meWithParentV1 ---
meWithParentV1Buffer:            <Buffer 10 44 61 76 69 64 73 6f 6e 0a 44 61 76 69 64>
meWithParentV1ReadFromV1Buffer:  {"name":"Davidson","parent":{"name":"David"}}

--- Buffer Writer: meWithParentV2, Reader: meWithParentV2 ---
meWithParentV2Buffer:            <Buffer 10 44 61 76 69 64 73 6f 6e 0a 44 61 76 69 64 0c 46 61 74 68 65 72>
meWithParentV2ReadFromV2Buffer:  {"name":"Davidson","parent":{"name":"David","gender":"Father"}}

--- Buffer Writer: meWithParentV1, Reader: meWithParentV2 ---
meWithParentV1Buffer:           <Buffer 10 44 61 76 69 64 73 6f 6e 0a 44 61 76 69 64>
meWithParentV2ReadFromV1Buffer: ERROR  truncated buffer

--- Buffer Writer: meWithParentV2, Reader: meWithParentV1 ---
meWithParentV2Buffer:           <Buffer 10 44 61 76 69 64 73 6f 6e 0a 44 61 76 69 64 0c 46 61 74 68 65 72>
meWithParentV1ReadFromV2Buffer: ERROR  trailing data
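
One possible reading of the two binary errors: Avro's binary encoding stores
only values (here, a zigzag varint length followed by UTF-8 bytes for each
string), with no field names or delimiters, so a reader that knows only its
own schema simply reads as many fields as that schema lists. The sketch below
decodes the Parent buffers from the output by hand in plain Node.js. The
helpers `readString`, `readStringRecord` and `attempt` are hypothetical, the
error strings are chosen to mirror the output above, and nothing here claims
to reproduce avro-js internals.

```javascript
// Read one Avro string (zigzag varint length + UTF-8 bytes) starting at `pos`.
// Only handles lengths that fit in 32 bits, which is enough for this sketch.
function readString(buf, pos) {
  let n = 0
  let shift = 0
  let byte
  do {
    if (pos >= buf.length) throw new Error('truncated buffer')
    byte = buf[pos++]
    n |= (byte & 0x7f) << shift
    shift += 7
  } while (byte & 0x80)
  const length = (n >>> 1) ^ -(n & 1) // undo the zigzag encoding
  if (length < 0 || pos + length > buf.length) throw new Error('truncated buffer')
  return { value: buf.toString('utf8', pos, pos + length), pos: pos + length }
}

// Decode `fieldCount` consecutive string fields, as a reader that only knows
// its own schema would, then insist that the whole buffer was consumed.
function readStringRecord(buf, fieldCount) {
  const values = []
  let pos = 0
  for (let i = 0; i < fieldCount; i++) {
    const r = readString(buf, pos)
    values.push(r.value)
    pos = r.pos
  }
  if (pos !== buf.length) throw new Error('trailing data')
  return values
}

// Run a decode and turn any failure into a printable message.
function attempt(buf, fieldCount) {
  try {
    return readStringRecord(buf, fieldCount)
  } catch (e) {
    return 'ERROR ' + e.message
  }
}

// Bytes copied from the parentV1Buffer and parentV2Buffer lines of the output.
const parentV1Bytes = Buffer.from('0a4461766964', 'hex')
const parentV2Bytes = Buffer.from('0a44617669640c466174686572', 'hex')

console.log(attempt(parentV1Bytes, 1)) // writer V1, reader V1 (1 field)
console.log(attempt(parentV2Bytes, 2)) // writer V2, reader V2 (2 fields)
console.log(attempt(parentV1Bytes, 2)) // writer V1, reader V2: runs out of bytes
console.log(attempt(parentV2Bytes, 1)) // writer V2, reader V1: bytes left over
```

If that reading is right, binary decoding needs the writer's schema in
addition to the reader's. The avro-js documentation describes
`readerType.createResolver(writerType)` for this, with the resolver passed as
the second argument to `fromBuffer()`; that call is untested against this
script and should be checked against the installed version.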


Script src:

const avro = require('avro-js')

// Version 1 of the Parent record: a single `name` field.
const parentV1Type = avro.parse({
  name: 'Parent',
  type: 'record',
  fields: [
    { name: 'name', type: 'string' }
  ]
})

// Version 2 adds a `gender` field with a default value.
const parentV2Type = avro.parse({
  name: 'Parent',
  type: 'record',
  fields: [
    { name: 'name', type: 'string' },
    { name: 'gender', type: 'string', default: 'unspecified' }
  ]
})

// Me refers to Parent by name; the registry passed to avro.parse() decides
// which version of Parent is used.
const meSchema = {
  name: 'Me',
  type: 'record',
  fields: [
    { name: 'name', type: 'string' },
    { name: 'parent', type: 'Parent' }
  ]
}

const meWithParentV2Type = avro.parse(meSchema, {
  registry: {
    Parent: parentV2Type
  }
})

const meWithParentV1Type = avro.parse(meSchema, {
  registry: {
    Parent: parentV1Type
  }
})

// Sample values for each schema version.
const parentV1 = { name: 'David' }
const parentV2 = { name: 'David', gender: 'Father' }

const meWithParentV1 = {
  name: 'Davidson',
  parent: parentV1
}

const meWithParentV2 = {
  name: 'Davidson',
  parent: parentV2
}

console.log("")
console.log("--- JSON Writer: ParentV1, Reader: ParentV2 ---")
const parentV1Json = parentV1Type.toString(parentV1)
console.log("parentV1Json: ", parentV1Json)
const parentV2ReadFromV1Json = parentV2Type.fromString(parentV1Json)
console.log("parentV2ReadFromV1Json: ", JSON.stringify(parentV2ReadFromV1Json))

console.log("")
console.log("--- JSON Writer: ParentV2, Reader: ParentV1 ---")
const parentV2Json = parentV2Type.toString(parentV2)
console.log("parentV2Json: ", parentV2Json)
const parentV1ReadFromV2Json = parentV1Type.fromString(parentV2Json)
console.log("parentV1ReadFromV2Json: ", JSON.stringify(parentV1ReadFromV2Json))

console.log("")
console.log("--- Buffer Writer: ParentV1, Reader: ParentV1 ---")
const parentV1Buffer = parentV1Type.toBuffer(parentV1)
console.log("parentV1Buffer: ", parentV1Buffer)
const parentV1ReadFromV1Buffer = parentV1Type.fromBuffer(parentV1Buffer)
console.log("parentV1ReadFromV1Buffer: ", JSON.stringify(parentV1ReadFromV1Buffer))

console.log("")
console.log("--- Buffer Writer: ParentV2, Reader: ParentV2 ---")
const parentV2Buffer = parentV2Type.toBuffer(parentV2)
console.log("parentV2Buffer: ", parentV2Buffer)
const parentV2ReadFromV2Buffer = parentV2Type.fromBuffer(parentV2Buffer)
console.log("parentV2ReadFromV1Buffer: ", JSON.stringify(parentV2ReadFromV2Buffer))

console.log("")
console.log("--- Buffer Writer: ParentV1, Reader: ParentV2 ---")
console.log("parentV1Buffer: ", parentV1Buffer)
try {
  const parentV2ReadFromV1Buffer = parentV2Type.fromBuffer(parentV1Buffer)
  console.log("parentV2ReadFromV1Buffer: ", JSON.stringify(parentV2ReadFromV1Buffer))
} catch (e) {
  console.log("parentV2ReadFromV1Buffer: ERROR ", e.message)
}

console.log("")
console.log("--- Buffer Writer: ParentV2, Reader: ParentV1 ---")
console.log("parentV2Buffer: ", parentV2Buffer)
try {
  const parentV1ReadFromV2Buffer = parentV1Type.fromBuffer(parentV2Buffer)
  console.log("parentV1ReadFromV2Buffer: ", JSON.stringify(parentV1ReadFromV2Buffer))
} catch (e) {
  console.log("parentV1ReadFromV2Buffer: ERROR ", e.message)
}

console.log("")
console.log("--- JSON Writer: meWithParentV1, Reader: meWithParentV2 ---")
const meWithParentV1Json = meWithParentV1Type.toString(meWithParentV1)
console.log("meWithParentV1Json: ", meWithParentV1Json)
const meWithParentV2ReadFromV1Json = meWithParentV2Type.fromString(meWithParentV1Json)
console.log("meWithParentV2ReadFromV1Json: ", JSON.stringify(meWithParentV2ReadFromV1Json))

console.log("")
console.log("--- JSON Writer: meWithParentV2, Reader: meWithParentV1 ---")
const meWithParentV2Json = meWithParentV2Type.toString(meWithParentV2)
console.log("meWithParentV2Json: ", meWithParentV2Json)
const meWithParentV1ReadFromV2Json = meWithParentV1Type.fromString(meWithParentV2Json)
console.log("meWithParentV1ReadFromV2Json: ", JSON.stringify(meWithParentV1ReadFromV2Json))

console.log("")
console.log("--- Buffer Writer: meWithParentV1, Reader: meWithParentV1 ---")
const meWithParentV1Buffer = meWithParentV1Type.toBuffer(meWithParentV1)
console.log("meWithParentV1Buffer: ", meWithParentV1Buffer)
const meWithParentV1ReadFromV1Buffer = meWithParentV1Type.fromBuffer(meWithParentV1Buffer)
console.log("meWithParentV1ReadFromV1Buffer: ", JSON.stringify(meWithParentV1ReadFromV1Buffer))

console.log("")
console.log("--- Buffer Writer: meWithParentV2, Reader: meWithParentV2 ---")
const meWithParentV2Buffer = meWithParentV2Type.toBuffer(meWithParentV2)
console.log("meWithParentV2Buffer: ", meWithParentV2Buffer)
const meWithParentV2ReadFromV2Buffer = meWithParentV2Type.fromBuffer(meWithParentV2Buffer)
console.log("meWithParentV2ReadFromV2Buffer: ", JSON.stringify(meWithParentV2ReadFromV2Buffer))

console.log("")
console.log("--- Buffer Writer: meWithParentV1, Reader: meWithParentV2 ---")
console.log("meWithParentV1Buffer: ", meWithParentV1Buffer)
try {
  const meWithParentV2ReadFromV1Buffer = meWithParentV2Type.fromBuffer(meWithParentV1Buffer)
  console.log("meWithParentV2ReadFromV1Buffer: ", JSON.stringify(meWithParentV2ReadFromV1Buffer))
} catch (e) {
  console.log("meWithParentV2ReadFromV1Buffer: ERROR ", e.message)
}

console.log("")
console.log("--- Buffer Writer: meWithParentV2, Reader: meWithParentV1 ---")
console.log("meWithParentV2Buffer: ", meWithParentV2Buffer)
try {
  const meWithParentV1ReadFromV2Buffer = meWithParentV1Type.fromBuffer(meWithParentV2Buffer)
  console.log("meWithParentV1ReadFromV2Buffer: ", JSON.stringify(meWithParentV1ReadFromV2Buffer))
} catch (e) {
  console.log("meWithParentV1ReadFromV2Buffer: ERROR ", e.message)
}


If this is indeed a bug, how do I create a ticket in the Jira board to
report it? Thanks.

Matthew

Re: Possible bug with avro-js with binary serdes and schema resolution

Posted by Ryan Skraba <ry...@skraba.com>.
Hey -- thanks for reaching out!  This is the right place to reach out
for JIRA account creation -- we're still adjusting to the new changes,
but I can create an account for you.

I'll send an email with some details to confirm.

Ryan

On Wed, Dec 7, 2022 at 10:26 AM Martin Grigorov <mg...@apache.org> wrote:
>
> Hi Matthew,
>
> Since you are not subscribed to the mailing list your messages have to be
> moderated.
> I didn't notice anything from you at user@avro.a.o
>
> About JIRA - you can ask for an account here (dev@avro.a.o) or
> private@avro.a.o. We need an username, display name and email address to
> create one for you.
>
> Martin
>
>

Re: Possible bug with avro-js with binary serdes and schema resolution

Posted by Martin Grigorov <mg...@apache.org>.
Hi Matthew,

Since you are not subscribed to the mailing list, your messages have to be
moderated.
I didn't notice anything from you at user@avro.a.o.

About JIRA - you can ask for an account here (dev@avro.a.o) or at
private@avro.a.o. We need a username, display name and email address to
create one for you.

Martin

